Abstract

A statistical functional, such as the mean or the median, is called elicitable if there 
is a scoring function or loss function such that the correct forecast of the functional is 
the unique minimizer of the expected score. Such scoring functions are called strictly 
consistent for the functional. The elicitability of a functional opens the possibility to 
compare competing forecasts and to rank them in terms of their realized scores. In 
this paper, we explore the notion of elicitability for multi-dimensional functionals and 
give both necessary and sufficient conditions for strictly consistent scoring functions. 

We cover the case of functionals with elicitable components, but we also show that 
one-dimensional functionals that are not elicitable can be a component of a higher 
order elicitable functional. In the case of the variance this is a known result. However, 
an important result of this paper is that spectral risk measures with a spectral measure 
with finite support are jointly elicitable if one adds the ‘correct’ quantiles. A direct 
consequence of applied interest is that the pair (Value at Risk, Expected Shortfall) is 
jointly elicitable under mild conditions that are usually fulfilled in risk management 
applications. 
1 Introduction

Point forecasts for uncertain future events are issued in a variety of different contexts 
such as business, government, risk-management or meteorology, and they are often used 
as the basis for strategic decisions. In all these situations, one has a random quantity Y 
with unknown distribution F. One is interested in a statistical property of F, that is a 
functional T(F). Here, Y can be real-valued (GDP growth for next year), vector-valued 
(wind-speed, income from taxes for all cantons of Switzerland), functional-valued (path of 
the interchange rate Euro - Swiss franc over one day), or set-valued (area of rain tomorrow, 
area of influenza in a country). Likewise, the functional T can take values of many
different kinds, among them the real- and vector-valued case (mean, vector
of moments, covariance matrix, expectiles), the set-valued case (confidence regions) or
the function-valued case (distribution functions). This article is concerned with the
situation where Y is a d-dimensional random vector and T is a k-dimensional functional,
thus also covering the real-valued case.


It is common to assess and compare competing point forecasts in terms of a loss
function or scoring function. This is a function $S$, such as the squared error or the absolute
error, which is negatively oriented in the following sense: if the forecast $x \in \mathbb{R}^k$ is issued
and the event $y \in \mathbb{R}^d$ materializes, the forecaster is penalized by the real value $S(x,y)$.
In the presence of several different forecasters one can compare their performances by
ranking their realized scores. Hence, forecasters have an incentive to minimize their Bayes
risk or expected loss $\mathbb{E}_F[S(x,Y)]$. Gneiting (2011) demonstrated impressively that scoring
functions should be incentive compatible in that they should encourage the forecasters
to issue truthful reports; see also Murphy and Daan (1985); Engelberg et al. (2009). In
other words, the choice of the scoring function $S$ must be consistent with the choice
of the functional $T$. We say a scoring function $S$ is $\mathcal{F}$-consistent for a functional $T$ if
$T(F) \in \arg\min_x \mathbb{E}_F[S(x,Y)]$ for all $F \in \mathcal{F}$, where the class $\mathcal{F}$ of probability distributions
is the domain of $T$. If $T(F)$ is the unique minimizer of the expected score for all $F \in \mathcal{F}$, we
say that $S$ is strictly $\mathcal{F}$-consistent for $T$. Hence, a strictly $\mathcal{F}$-consistent scoring function
for $T$ elicits $T$. Following Lambert et al. (2008) and Gneiting (2011), we call a functional
$T$ with domain $\mathcal{F}$ elicitable if there exists a strictly $\mathcal{F}$-consistent scoring function for $T$.
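The definition of consistency can be checked numerically on a toy example. The following is a minimal sketch, not part of the original text: the five observations, the grid of candidate reports and all function names are illustrative assumptions. It takes the empirical distribution of the observations as $F$ and confirms that the squared error is minimized in expectation by the mean and the absolute error by the median.

```python
import statistics

def expected_score(score, x, sample):
    """Average score of the report x under the empirical distribution of sample."""
    return sum(score(x, y) for y in sample) / len(sample)

def squared_error(x, y):
    return (x - y) ** 2

def absolute_error(x, y):
    return abs(x - y)

# Hypothetical observations; their empirical distribution plays the role of F.
sample = [0.0, 1.0, 2.0, 3.0, 10.0]
# Candidate reports x on a grid over [0, 10] with step 0.01 (an assumption).
grid = [i / 100 for i in range(1001)]

best_for_squared = min(grid, key=lambda x: expected_score(squared_error, x, sample))
best_for_absolute = min(grid, key=lambda x: expected_score(absolute_error, x, sample))

print(best_for_squared, statistics.mean(sample))     # grid minimizer vs. mean
print(best_for_absolute, statistics.median(sample))  # grid minimizer vs. median
```

Ranking forecasters by realized scores rewards the mean under the squared error and the median under the absolute error, which is why the scoring function has to match the functional that is requested.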


The elicitability of a functional allows for regression, such as quantile regression and
expectile regression (Koenker, 2005; Newey and Powell, 1987), and for M-estimation (Huber,
1964). Early work on elicitability is due to Osband (1985); Osband and Reichelstein
(1985). More recent advances in the one-dimensional case, that is $k = d = 1$, are due
to Gneiting (2011); Lambert (2013); Steinwart et al. (2014), with the latter showing the
intimate relation between elicitability and identifiability. Under mild conditions, many
important functionals are elicitable, such as moments, ratios of moments, quantiles and
expectiles. However, there are also relevant functionals which are not elicitable, such
as variance, mode, or Expected Shortfall (Osband, 1985; Weber, 2006; Gneiting, 2011;
Heinrich, 2013).


With the so-called revelation principle (see Proposition 2.13), Osband (1985) was one
of the first to show that a functional, albeit itself not being elicitable, can be a component
of an elicitable vector-valued functional. The most prominent example in this direction
is that the pair (mean, variance) is elicitable despite the fact that variance itself is not.
However, it is crucial for the validity of the revelation principle that there is a bijection
between the pair (mean, variance) and the first two moments. Until now, it appeared as an
open problem if there are elicitable functionals with non-elicitable components other than
those which can be connected to a functional with elicitable components via a bijection.
Frongillo and Kash (2015) conjectured that this is generally not possible. We solve this
open problem and can reject their conjecture: Corollary 5.5 shows that the pair (Value at
Risk, Expected Shortfall) is elicitable, subject to mild regularity assumptions, improving
a recent partial result of Acerbi and Szekely (2014). To the best of our knowledge, we
provide the first proof of this result in full generality. In fact, Corollary 5.4 demonstrates
more generally that spectral risk measures with a spectral measure having finite support
in $(0,1]$ can be a component of an elicitable vector-valued functional. These results may
lead to a new direction in the contemporary discussion about what risk measure is best in
practice, and in particular about the importance of elicitability in risk measurement
contexts (Embrechts and Hofert, 2014; Emmer et al., 2013; Davis, 2013; Acerbi and Szekely,
2014).


Complementing the question whether a functional is elicitable or not, it is interesting
to determine the class of strictly consistent scoring functions for a functional, or at least
to characterize necessary and sufficient conditions for the strict consistency of a scoring
function. Most of the existing literature focuses on real-valued functionals, meaning that
$k = 1$. For the case $k > 1$, mainly linear functionals, that is, vectors of expectations of
certain transformations, are classified, where the only strictly consistent scoring functions are
Bregman functions (Savage, 1971; Osband and Reichelstein, 1985; Dawid and Sebastiani,
1999; Banerjee et al., 2005; Abernethy and Frongillo, 2012); for a general overview of the
existing literature, we refer to Gneiting (2011). To the best of our knowledge, only Osband
(1985), Lambert et al. (2008) and Frongillo and Kash (2015) investigated more general
cases of functionals, the latter also treating vectors of ratios of expectations as the first
non-linear functionals. In his doctoral thesis, Osband (1985) established a necessary
representation for the first order derivative of a strictly consistent scoring function with respect
to the report $x$ which connects it with identification functions. Following Gneiting (2011),
we call results in the same flavor Osband’s principle. Theorem 3.2 in this paper
complements and generalizes Osband (1985, Theorem 2.1). Using our techniques, we retrieve the
results mentioned above concerning the Bregman representation, however under somewhat
stronger regularity assumptions than the one in Frongillo and Kash (2015); see Corollary
4.3. On the other hand, we are able to treat a much broader class of functionals; see
Proposition 4.1, Remark 4.4 and Theorem 5.2. In particular, we show that under mild
richness assumptions on the class $\mathcal{F}$, any strictly $\mathcal{F}$-consistent scoring function for a vector
of quantiles and/or expectiles is the sum of strictly $\mathcal{F}$-consistent one-dimensional scoring
functions for each quantile/expectile; see Corollary 4.2.

The paper is organized as follows. In Section 2, we introduce notation and derive
some basic results concerning the elicitability of $k$-dimensional functionals. Section 3
is concerned with Osband’s principle, Theorem 3.2, and its immediate consequences. We
investigate the situation where a functional is composed of elicitable components in Section
4, whereas Section 5 is dedicated to the elicitability of spectral risk measures. We end our
article with a brief discussion; see Section 6. Most proofs are deferred to Section 7.


2 Properties of higher order elicitability 

2.1 Notation and definitions 

Following Gneiting (2011), we introduce a decision-theoretic framework for the evaluation
of point forecasts. To this end, we introduce an observation domain $\mathsf{O} \subseteq \mathbb{R}^d$. We equip
$\mathsf{O}$ with the Borel $\sigma$-algebra $\mathcal{O}$ using the induced topology of $\mathbb{R}^d$. We identify a Borel
probability measure $P$ on $(\mathsf{O}, \mathcal{O})$ with its cumulative distribution function (cdf) $F_P\colon \mathsf{O} \to [0,1]$
defined as $F_P(x) := P((-\infty, x] \cap \mathsf{O})$, where $(-\infty, x] = (-\infty, x_1] \times \cdots \times (-\infty, x_d]$ for
$x = (x_1, \ldots, x_d) \in \mathbb{R}^d$. Let $\mathcal{F}$ be a class of distribution functions on $(\mathsf{O}, \mathcal{O})$. Furthermore,
for some integer $k \ge 1$, let $\mathsf{A} \subseteq \mathbb{R}^k$ be an action domain. To shorten notation, we usually
write $F \in \mathcal{F}$ for a cdf and also omit to mention the $\sigma$-algebra $\mathcal{O}$.

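Let $T\colon \mathcal{F} \to \mathsf{A}$ be a functional. We introduce the notation $T(\mathcal{F}) := \{x \in \mathsf{A} : x =
T(F) \text{ for some } F \in \mathcal{F}\}$. For a set $M \subseteq \mathbb{R}^k$ we will write $\operatorname{int}(M)$ for its interior with
respect to $\mathbb{R}^k$, that is, $\operatorname{int}(M)$ is the biggest open set $U \subseteq \mathbb{R}^k$ such that $U \subseteq M$. The
convex hull of $M$ is defined as

\[
\operatorname{conv}(M) := \Big\{ \sum_{i=1}^{n} \lambda_i x_i \;\Big|\; n \in \mathbb{N},\ x_1, \ldots, x_n \in M,\ \lambda_1, \ldots, \lambda_n > 0,\ \sum_{i=1}^{n} \lambda_i = 1 \Big\}.
\]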

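We say that a function $a\colon \mathsf{O} \to \mathbb{R}$ is $\mathcal{F}$-integrable if it is $F$-integrable for each $F \in \mathcal{F}$.
A function $g\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}$ is $\mathcal{F}$-integrable if $g(x, \cdot)$ is $\mathcal{F}$-integrable for each $x \in \mathsf{A}$. If $g$ is
$\mathcal{F}$-integrable, we introduce the map

\[
\bar g\colon \mathsf{A} \times \mathcal{F} \to \mathbb{R}, \qquad (x, F) \mapsto \bar g(x, F) = \int g(x, y)\, \mathrm{d}F(y).
\]

Consequently, for fixed $F \in \mathcal{F}$ we can consider the function $\bar g(\cdot, F)\colon \mathsf{A} \to \mathbb{R}$, $x \mapsto \bar g(x, F)$,
and for fixed $x \in \mathsf{A}$ we can consider the (linear) functional $\bar g(x, \cdot)\colon \mathcal{F} \to \mathbb{R}$, $F \mapsto \bar g(x, F)$.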

If we fix $y \in \mathsf{O}$ and $g$ is sufficiently smooth in its first argument, then for $m \in \{1, \ldots, k\}$
we denote the $m$-th partial derivative of the function $g(\cdot, y)$ with $\partial_m g(\cdot, y)$. More formally,
we set

\[
\partial_m g(\cdot, y)\colon \operatorname{int}(\mathsf{A}) \to \mathbb{R}, \qquad (x_1, \ldots, x_k) \mapsto \frac{\partial}{\partial x_m} g(x_1, \ldots, x_k, y).
\]

We denote by $\nabla g(\cdot, y)$ the gradient of $g(\cdot, y)$ defined as $\nabla g(\cdot, y) := (\partial_1 g(\cdot, y), \ldots, \partial_k g(\cdot, y))^\top$;
and with $\nabla^2 g(\cdot, y) := (\partial_l \partial_m g(\cdot, y))_{l,m = 1, \ldots, k}$ the Hessian of $g(\cdot, y)$. Mutatis mutandis, we
use the same notation for $\bar g(\cdot, F)$, $F \in \mathcal{F}$. We call a function on $\mathsf{A}$ differentiable if it is
differentiable in $\operatorname{int}(\mathsf{A})$ and use the notation as given above. The restriction of a function
$f$ to some subset $M$ of its domain is denoted by $f|_M$.

Definition 2.1 (Consistency). A scoring function is an $\mathcal{F}$-integrable function $S\colon \mathsf{A} \times \mathsf{O} \to
\mathbb{R}$. It is said to be $\mathcal{F}$-consistent for a functional $T\colon \mathcal{F} \to \mathsf{A}$ if $\bar S(T(F), F) \le \bar S(x, F)$ for all
$F \in \mathcal{F}$ and for all $x \in \mathsf{A}$. Furthermore, $S$ is strictly $\mathcal{F}$-consistent for $T$ if it is $\mathcal{F}$-consistent
for $T$ and if $\bar S(T(F), F) = \bar S(x, F)$ implies that $x = T(F)$ for all $F \in \mathcal{F}$ and for all $x \in \mathsf{A}$.
Wherever it is convenient we assume that $S(x, \cdot)$ is locally bounded for all $x \in \mathsf{A}$.

Definition 2.2 ($k$-elicitability). A functional $T\colon \mathcal{F} \to \mathsf{A} \subseteq \mathbb{R}^k$ is called $k$-elicitable, if
there exists a strictly $\mathcal{F}$-consistent scoring function for $T$.

Definition 2.3 (Identification function). An identification function is an $\mathcal{F}$-integrable
function $V\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}^k$. It is said to be an $\mathcal{F}$-identification function for a functional
$T\colon \mathcal{F} \to \mathsf{A} \subseteq \mathbb{R}^k$ if $\bar V(T(F), F) = 0$ for all $F \in \mathcal{F}$. Furthermore, $V$ is a strict
$\mathcal{F}$-identification function for $T$ if $\bar V(x, F) = 0$ holds if and only if $x = T(F)$ for all $F \in \mathcal{F}$
and for all $x \in \mathsf{A}$. Wherever it is convenient we assume that $V(x, \cdot)$ is locally bounded for
all $x \in \mathsf{A}$ and that $V(\cdot, y)$ is locally Lebesgue-integrable for all $y \in \mathsf{O}$.

Definition 2.4 ($k$-identifiability). A functional $T\colon \mathcal{F} \to \mathsf{A} \subseteq \mathbb{R}^k$ is said to be $k$-identifiable,
if there exists a strict $\mathcal{F}$-identification function for $T$.


If the dimension $k$ is clear from the context, we say that a functional is elicitable
(identifiable) instead of $k$-elicitable ($k$-identifiable).

Remark 2.5. Depending on the class $\mathcal{F}$, some statistical functionals such as quantiles can
be set-valued. In such situations, one can define $T\colon \mathcal{F} \to 2^{\mathsf{A}}$. Then, a scoring function
$S\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}$ is called (strictly) $\mathcal{F}$-consistent for $T$ if $\bar S(t, F) \le \bar S(x, F)$ for all $x \in \mathsf{A}$,
$F \in \mathcal{F}$ and $t \in T(F)$ (with equality implying $x \in T(F)$). The definition of a (strict)
$\mathcal{F}$-identification function for $T$ can be generalized mutatis mutandis. Many of the results
of this paper can be extended to the case of set-valued functionals, at the cost of a more
involved notation and analysis. To allow for a clear presentation, we confine ourselves to
functionals with values in $\mathbb{R}^k$ in this paper.

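If $V\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}^k$ is an $\mathcal{F}$-identification function for a functional $T\colon \mathcal{F} \to \mathsf{A}$ and
$h\colon \mathsf{A} \to \mathbb{R}^{k \times k}$ is a matrix-valued function, then the function

\[
hV\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}^k, \qquad (x, y) \mapsto hV(x, y) := h(x) V(x, y)
\]

is again an $\mathcal{F}$-identification function for $T$. If $V$ is a strict $\mathcal{F}$-identification function for $T$
and $\det(h(x)) \neq 0$ for all $x \in \mathsf{A}$, then $hV$ is also a strict $\mathcal{F}$-identification function for $T$.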


Remark 2.6. Steinwart et al. (2014) introduced the notion of an oriented strict $\mathcal{F}$-identification
function for the case $k = 1$ (and $d = 1$). They say that $V\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}$ is an oriented
strict $\mathcal{F}$-identification function for the functional $T\colon \mathcal{F} \to \mathsf{A}$ if $V$ is a strict $\mathcal{F}$-identification
function for $T$ and moreover

\[
\bar V(x, F) > 0 \iff x > T(F) \tag{2.1}
\]

for all $F \in \mathcal{F}$ and for all $x \in \mathsf{A}$. They show, under some regularity assumptions such as
the continuity of the functional $T$, that if $V$ is a strict $\mathcal{F}$-identification function for the
functional $T$ then either $V$ or $-V$ is oriented; see Steinwart et al. (2014, Lemma 6). This
notion of orientation can also be generalized to the case $k > 1$.


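Definition 2.7 (Orientation). Let $T\colon \mathcal{F} \to \mathsf{A}$ be a functional with a strict $\mathcal{F}$-identification
function $V\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}^k$. Then $V$ is called an oriented strict $\mathcal{F}$-identification function for
$T$ if

\[
v^\top \bar V(T(F) + s v, F) > 0 \iff s > 0
\]

for all $v \in \mathbb{S}^{k-1} := \{x \in \mathbb{R}^k : \|x\| = 1\}$, for all $F \in \mathcal{F}$ and for all $s \in \mathbb{R}$ such that
$T(F) + s v \in \mathsf{A}$.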


Indeed, the one-dimensional definition of orientation at (2.1) is nested in Definition
2.7 upon recalling that $\mathbb{S}^0 = \{-1, 1\}$. Under some smoothness assumptions, we can give
a necessary condition for the orientation of a strict $\mathcal{F}$-identification function $V$: Assume
that the function $\mathsf{A} \to \mathbb{R}^k$, $x \mapsto \bar V(x, F)$ is partially differentiable. If $V$ is oriented then the
matrix $(\partial_l \bar V_r(t, F))_{l,r = 1, \ldots, k}$ is positive semi-definite for all $F \in \mathcal{F}$ and $t = T(F)$. It appears
to be an open question under which conditions there exists an oriented identification
function for an identifiable functional. In the light of Lemma 2.9 (ii), Remark 2.10 and
Proposition [375] this would give insight whether the construction of a strictly proper scoring
function is possible.


Remark 2.8. Our notion of orientation differs from the one proposed by Frongillo and Kash
(2015). In contrast to their definition, our definition is per se independent of a (possibly
non-existing) strictly consistent scoring function for $T$. Moreover, with respect to Lemma
2.9 (ii) and Remark 2.10, the orientation of the gradient of a scoring function implies its
strict consistency.


2.2 Basic results 


The first lemma gives a sufficient condition for strict consistency and connects the notions 
of scoring functions and identification functions. 

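Lemma 2.9. (i) A scoring function $S\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}$ is strictly $\mathcal{F}$-consistent for $T\colon \mathcal{F} \to
\mathsf{A} \subseteq \mathbb{R}^k$ if and only if the function

\[
\psi\colon D \to \mathbb{R}, \qquad s \mapsto \bar S(t + s v, F)
\]

has a global unique minimum at $s = 0$ for all $F \in \mathcal{F}$, $t = T(F)$ and $v \in \mathbb{S}^{k-1}$, where
$D = \{s \in \mathbb{R} : t + s v \in \mathsf{A}\}$.

(ii) Let $S\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}$ be a scoring function that is continuously differentiable in its first
argument and let $\mathcal{F}' = T^{-1}(\operatorname{int}(\mathsf{A})) \subseteq \mathcal{F}$. If $\nabla S\colon \operatorname{int}(\mathsf{A}) \times \mathsf{O} \to \mathbb{R}^k$ is an oriented
strict $\mathcal{F}'$-identification function for $T|_{\mathcal{F}'}$, then $S|_{\operatorname{int}(\mathsf{A}) \times \mathsf{O}}$ is a strictly $\mathcal{F}'$-consistent
scoring function for $T|_{\mathcal{F}'}$.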


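Remark 2.10. One can weaken the assumptions of Lemma 2.9 (ii) on the smoothness of
$S$. Let $S\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}$ be a scoring function such that $\bar S(\cdot, F)$ is continuously differentiable
for all $F \in \mathcal{F}$. If $\mathcal{F}$ consists of absolutely continuous distributions, this is a much weaker
requirement; see Section 3 for a detailed discussion. Let $\mathcal{F}' = T^{-1}(\operatorname{int}(\mathsf{A})) \subseteq \mathcal{F}$. If for all
$F \in \mathcal{F}'$, $t = T(F) \in \operatorname{int}(\mathsf{A})$, for all $v \in \mathbb{S}^{k-1}$ and for all $s \in \mathbb{R}$ such that $t + s v \in \operatorname{int}(\mathsf{A})$ we
have that

\[
v^\top \nabla \bar S(t + s v, F)
\begin{cases}
> 0, & \text{if } s > 0, \\
= 0, & \text{if } s = 0, \\
< 0, & \text{if } s < 0,
\end{cases}
\]

then $S|_{\operatorname{int}(\mathsf{A}) \times \mathsf{O}}$ is a strictly $\mathcal{F}'$-consistent scoring function for $T|_{\mathcal{F}'}$.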


The following result follows directly from the definition of consistency (Definition 2.1).
However, it is crucial to understand many of the results of this paper.


Lemma 2.11. Let $T\colon \mathcal{F} \to \mathsf{A} \subseteq \mathbb{R}^k$ be a functional with a strictly $\mathcal{F}$-consistent scoring
function $S\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}$. Then the following two assertions hold.

(i) Let $\mathcal{F}' \subseteq \mathcal{F}$ and $T|_{\mathcal{F}'}$ be the restriction of $T$ to $\mathcal{F}'$. Then $S$ is also a strictly
$\mathcal{F}'$-consistent scoring function for $T|_{\mathcal{F}'}$.

(ii) Let $\mathsf{A}' \subseteq \mathsf{A}$ such that $T(\mathcal{F}) \subseteq \mathsf{A}'$ and $S|_{\mathsf{A}' \times \mathsf{O}}$ be the restriction of $S$ to $\mathsf{A}' \times \mathsf{O}$. Then
$S|_{\mathsf{A}' \times \mathsf{O}}$ is also a strictly $\mathcal{F}$-consistent scoring function for $T$.


The main results of this paper consist of necessary and sufficient conditions for the strict
$\mathcal{F}$-consistency of a scoring function $S$ for some functional $T$. What are the consequences
of Lemma 2.11 for such conditions? Assume that we start with a functional $T'\colon \mathcal{F}' \to
\mathsf{A}' \subseteq \mathbb{R}^k$ and deduce some necessary conditions for a scoring function $S'\colon \mathsf{A}' \times \mathsf{O} \to \mathbb{R}$
to be strictly $\mathcal{F}'$-consistent for $T'$. Then Lemma 2.11 (i) implies that these conditions
continue to be necessary conditions for the strict $\mathcal{F}$-consistency of $S'$ for $T\colon \mathcal{F} \to \mathsf{A}'$ where
$\mathcal{F}' \subseteq \mathcal{F}$, and $T$ is some extension of $T'$ such that $T(\mathcal{F}) \subseteq \mathsf{A}'$. On the other hand, Lemma
2.11 (ii) implies that the necessary conditions for the strict $\mathcal{F}'$-consistency of a scoring
function $S'\colon \mathsf{A}' \times \mathsf{O} \to \mathbb{R}$ continue to be necessary conditions for the strict $\mathcal{F}'$-consistency
of $S\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}$ for $T'$, where $\mathsf{A}' \subseteq \mathsf{A}$ and $S$ is some extension of $S'$.

Summarizing, given a functional $T\colon \mathcal{F} \to \mathsf{A}$, a collection of necessary conditions for
the strict $\mathcal{F}$-consistency of scoring functions for $T$ is the more restrictive the smaller the
class $\mathcal{F}$ and the smaller the set $\mathsf{A}$ is (provided that $T(\mathcal{F}) \subseteq \mathsf{A}$, of course). Hence, in
the forthcoming results concerning necessary conditions, it is no loss of generality to just
mention which distributions must necessarily be in the class $\mathcal{F}$ to guarantee the validity
of the results. Furthermore, it is no loss of generality to make the assumption that $T$ is
surjective, so $\mathsf{A} = T(\mathcal{F})$.

Some of the subsequent results also provide sufficient conditions for the strict
$\mathcal{F}$-consistency of a scoring function $S\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}$ for a functional $T\colon \mathcal{F} \to \mathsf{A}$. Those results
are the stronger the bigger the class $\mathcal{F}$ and the bigger the set $\mathsf{A}$ is. For the notion of
elicitability this means that the assertion that a functional $T\colon \mathcal{F} \to \mathsf{A}$ is elicitable is also
the stronger the bigger the class $\mathcal{F}$ and the bigger the set $\mathsf{A}$ is. To demonstrate this
reasoning, observe that if the functional $T\colon \mathcal{F} \to \mathsf{A}$ is degenerate in the sense that it
is constant, so $T \equiv t$ for some $t \in \mathsf{A}$ (which covers the particular case that $\mathcal{F}$ contains
only one element), then $T$ is automatically elicitable with a strictly $\mathcal{F}$-consistent scoring
function $S\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}$, defined as $S(x, y) := \|x - t\|$.

Strictly consistent scoring functions for a given functional $T$ are not unique. In
particular, the following result generalizes directly from the one-dimensional case. Let
$S\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}$ be a strictly $\mathcal{F}$-consistent scoring function for a functional $T\colon \mathcal{F} \to \mathsf{A}$. Then,
for any $\lambda > 0$ and any $\mathcal{F}$-integrable function $a\colon \mathsf{O} \to \mathbb{R}$, the scoring function

\[
\tilde S(x, y) := \lambda S(x, y) + a(y) \tag{2.2}
\]

is again strictly $\mathcal{F}$-consistent for $T$. Gneiting (2011, Theorem 2) shows that in the
one-dimensional case under the assumption $S(x, y) \ge 0$, the class of consistent scoring functions
is a convex cone. Generally, the assumption of scoring functions being nonnegative is
natural if $\delta_y \in \mathcal{F}$ for all $y \in \mathsf{O}$ because for an $\mathcal{F}$-consistent scoring function $S$, the scoring
function $\tilde S(x, y) := S(x, y) - \bar S(T(\delta_y), \delta_y)$ is nonnegative, and it is of the form (2.2) if
$y \mapsto \bar S(T(\delta_y), \delta_y)$ is $\mathcal{F}$-integrable. As we are particularly interested in classes $\mathcal{F}$ of absolutely continuous
distributions in this manuscript, we do not require scoring functions to be nonnegative.
We generalize Gneiting (2011, Theorem 2) as follows, showing that the class of strictly
$\mathcal{F}$-consistent scoring functions for $T$ is a convex cone (not including zero). The proof follows
easily using Fubini’s theorem and is omitted.


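Proposition 2.12. Let $T\colon \mathcal{F} \to \mathsf{A} \subseteq \mathbb{R}^k$ be a functional. Let $(Z, \mathcal{Z})$ be a measurable
space with a $\sigma$-finite measure $\nu$ where $\nu \neq 0$. Let $\{S_z : z \in Z\}$ be a family of strictly
$\mathcal{F}$-consistent scoring functions $S_z\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}$ for $T$. If for all $x \in \mathsf{A}$ and for all $F \in \mathcal{F}$
the map $Z \times \mathsf{O} \to \mathbb{R}$, $(z, y) \mapsto S_z(x, y)$, is $\nu \otimes F$-integrable, then the scoring function

\[
S\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}, \qquad (x, y) \mapsto S(x, y) = \int_Z S_z(x, y)\, \nu(\mathrm{d}z)
\]

is strictly $\mathcal{F}$-consistent for $T$.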


Point forecasts and probabilistic forecasts are closely related. Probabilistic forecasts,
issuing a whole probability distribution, can be evaluated in terms of scoring rules (Winkler,
1996; Gneiting and Raftery, 2007). A scoring rule is a map $R\colon \mathcal{F} \times \mathsf{O} \to \mathbb{R}$ such that for
each $G \in \mathcal{F}$, the map $\mathsf{O} \to \mathbb{R}$, $y \mapsto R(G, y)$ is $\mathcal{F}$-integrable. A scoring rule is (strictly)
$\mathcal{F}$-proper if $\bar R(F, F) \le \bar R(G, F)$ for all $F, G \in \mathcal{F}$ (with equality implying $F = G$). As in
the one-dimensional case (Gneiting, 2011, Theorem 3), each $\mathcal{F}$-consistent scoring function
$S$ for a functional $T\colon \mathcal{F} \to \mathsf{A} \subseteq \mathbb{R}^k$ induces an $\mathcal{F}$-proper scoring rule $R$ via

\[
R\colon \mathcal{F} \times \mathsf{O} \to \mathbb{R}, \qquad (F, y) \mapsto R(F, y) = S(T(F), y).
\]

However, if we do not impose that the functional $T$ is injective, we cannot conclude that
$R$ is a strictly $\mathcal{F}$-proper scoring rule even if the scoring function $S$ is strictly $\mathcal{F}$-consistent.
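The failure of strict propriety for non-injective functionals can be seen in a small example. The sketch below is illustrative only: it assumes the squared error as scoring function, the mean as functional, and three hypothetical two-point distributions stored as dictionaries mapping outcomes to probabilities. Two distinct distributions with the same mean receive the same expected score, so the induced rule is proper but not strictly proper.

```python
def mean(F):
    """Mean of a discrete distribution given as {outcome: probability}."""
    return sum(y * p for y, p in F.items())

def expected_rule(G, F):
    """Expected score of the induced rule R(G, y) = (mean(G) - y)**2 under F."""
    report = mean(G)
    return sum(p * (report - y) ** 2 for y, p in F.items())

# Hypothetical distributions: F and G share their mean, H does not.
F = {-1.0: 0.5, 1.0: 0.5}
G = {-2.0: 0.5, 2.0: 0.5}
H = {0.0: 0.5, 2.0: 0.5}

print(expected_rule(F, F))  # score of the true distribution
print(expected_rule(G, F))  # same score although G differs from F
print(expected_rule(H, F))  # strictly larger, since mean(H) != mean(F)
```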

Many important statistical functionals are transformations of other statistical
functionals, for example variance and first and second moment are related in this manner. The
following revelation principle, which originates from Osband (1985, p. 8) and is also given
in Gneiting (2011, Theorem 4), states that if two functionals are related by a bijection, then
one of them is elicitable if and only if the other one is elicitable. The assertion also holds
upon replacing ‘elicitable’ with ‘identifiable’. We omit the proof which is straightforward.

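Proposition 2.13 (Revelation principle). Let $g\colon \mathsf{A} \to \mathsf{A}'$ be a bijection with inverse $g^{-1}$,
where $\mathsf{A}, \mathsf{A}' \subseteq \mathbb{R}^k$. Let $T\colon \mathcal{F} \to \mathsf{A}$ be a functional. Then the following two assertions hold.

(i) The functional $T\colon \mathcal{F} \to \mathsf{A}$ is identifiable if and only if $T_g = g \circ T\colon \mathcal{F} \to \mathsf{A}'$ is
identifiable. The function $V\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}^k$ is a strict $\mathcal{F}$-identification function for $T$
if and only if

\[
V_g\colon \mathsf{A}' \times \mathsf{O} \to \mathbb{R}^k, \qquad (x', y) \mapsto V_g(x', y) = V(g^{-1}(x'), y)
\]

is a strict $\mathcal{F}$-identification function for $T_g$.

(ii) The functional $T\colon \mathcal{F} \to \mathsf{A}$ is elicitable if and only if $T_g = g \circ T\colon \mathcal{F} \to \mathsf{A}'$ is elicitable.
The function $S\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}$ is a strictly $\mathcal{F}$-consistent scoring function for $T$ if and
only if

\[
S_g\colon \mathsf{A}' \times \mathsf{O} \to \mathbb{R}, \qquad (x', y) \mapsto S_g(x', y) = S(g^{-1}(x'), y)
\]

is a strictly $\mathcal{F}$-consistent scoring function for $T_g$.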



















We remark that also Gneiting (2011, Theorem 5) on weighted scoring functions carries
over directly to the higher order case. Furthermore, convexity of level sets continues to be
a necessary condition for elicitability. The result is classical in the literature and was first
presented in Osband (1985, Proposition 2.5); see also Gneiting (2011, Theorem 6).


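Proposition 2.14 (Osband). Let $T\colon \mathcal{F} \to \mathsf{A} \subseteq \mathbb{R}^k$ be an elicitable functional. Then for all
$F_0, F_1 \in \mathcal{F}$ with $t := T(F_0) = T(F_1)$ and for all $\lambda \in (0,1)$ such that $F_\lambda := (1 - \lambda) F_0 + \lambda F_1 \in
\mathcal{F}$ it holds that $t = T(F_\lambda)$.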
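The convex level set condition is easy to test on discrete distributions. The following minimal sketch is an illustration with assumed inputs: the two-point distributions and the helper names are hypothetical. It mixes two distributions with equal variance and shows that the mixture has a different variance, whereas mixing two distributions with equal mean preserves the mean.

```python
def mix(F0, F1, lam):
    """Mixture (1 - lam) * F0 + lam * F1 of discrete distributions {outcome: prob}."""
    out = {}
    for y, p in F0.items():
        out[y] = out.get(y, 0.0) + (1 - lam) * p
    for y, p in F1.items():
        out[y] = out.get(y, 0.0) + lam * p
    return out

def mean(F):
    return sum(y * p for y, p in F.items())

def variance(F):
    m = mean(F)
    return sum(p * (y - m) ** 2 for y, p in F.items())

# Same variance (equal to 1), different means: the level set of the variance is left.
F0 = {-1.0: 0.5, 1.0: 0.5}
F1 = {9.0: 0.5, 11.0: 0.5}
print(variance(F0), variance(F1), variance(mix(F0, F1, 0.5)))

# Same mean (equal to 0): the mixture stays in the level set of the mean.
G0 = {-1.0: 0.5, 1.0: 0.5}
G1 = {-2.0: 0.5, 2.0: 0.5}
print(mean(G0), mean(G1), mean(mix(G0, G1, 0.5)))
```

By Proposition 2.14, the first computation is a witness that the variance cannot be elicitable on any class containing these three distributions.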


As a last result in this section, we present the intuitive observation that a vector of 
elicitable functionals itself is elicitable. 

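Lemma 2.15. Let $k_1, \ldots, k_l \ge 1$ and let $T_m\colon \mathcal{F} \to \mathsf{A}_m \subseteq \mathbb{R}^{k_m}$ be a $k_m$-elicitable functional,
$m \in \{1, \ldots, l\}$. Then the functional $T = (T_1, \ldots, T_l)\colon \mathcal{F} \to \mathsf{A}$ is $k$-elicitable where
$k = k_1 + \cdots + k_l$ and $\mathsf{A} = \mathsf{A}_1 \times \cdots \times \mathsf{A}_l \subseteq \mathbb{R}^k$.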


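Proof. For $m \in \{1, \ldots, l\}$ let $S_m\colon \mathsf{A}_m \times \mathsf{O} \to \mathbb{R}$ be a strictly $\mathcal{F}$-consistent scoring function
for $T_m$. Let $\lambda_1, \ldots, \lambda_l > 0$ be positive real numbers. Then

\[
S\colon \mathsf{A}_1 \times \cdots \times \mathsf{A}_l \times \mathsf{O} \to \mathbb{R}, \qquad
(x_1, \ldots, x_l, y) \mapsto S(x_1, \ldots, x_l, y) := \sum_{m=1}^{l} \lambda_m S_m(x_m, y) \tag{2.3}
\]

is a strictly $\mathcal{F}$-consistent scoring function for $T$. □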
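The construction in (2.3) can be illustrated for a vector of two quantiles. The sketch below rests on several assumptions that are not in the text: the component scores are the familiar piecewise linear (pinball) quantile scores, the weights 1.0 and 2.5 are arbitrary, and $F$ is the empirical distribution of five hypothetical observations, chosen so that the 0.3- and 0.7-quantiles are unique. A grid search over pairs of reports recovers both quantiles simultaneously.

```python
from itertools import product

def pinball(alpha):
    """Piecewise linear score for the alpha-quantile: (1{y <= x} - alpha) * (x - y)."""
    def score(x, y):
        return ((1.0 if y <= x else 0.0) - alpha) * (x - y)
    return score

def sum_score(scores, weights):
    """Weighted sum of component scores, in the form of (2.3)."""
    def score(xs, y):
        return sum(w * s(x, y) for w, s, x in zip(weights, scores, xs))
    return score

def expected_score(score, xs, sample):
    return sum(score(xs, y) for y in sample) / len(sample)

sample = [0.0, 1.0, 2.0, 3.0, 10.0]      # hypothetical observations
S = sum_score([pinball(0.3), pinball(0.7)], weights=[1.0, 2.5])

grid = [i / 10 for i in range(101)]      # candidate reports 0.0, 0.1, ..., 10.0
best = min(product(grid, grid), key=lambda xs: expected_score(S, xs, sample))
print(best)                              # pair of reports minimizing the summed score
```

Changing the positive weights rescales the components but does not move the minimizer, in line with the proof of Lemma 2.15.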


A particularly simple and relevant case of Lemma 2.15 is the situation $k_1 = \cdots = k_l = 1$
such that $k = l$. It is an interesting question whether the scoring functions of the form
(2.3) are the only strictly $\mathcal{F}$-consistent scoring functions for $T$, which amounts to the
question of separability of scoring rules that was posed by Frongillo and Kash (2015).
The answer is generally negative. As mentioned in the introduction, it is known that
all Bregman functions elicit $T$, if the components of $T$ are all expectations of
transformations of $Y$ (Savage, 1971; Osband and Reichelstein, 1985; Dawid and Sebastiani, 1999;
Banerjee et al., 2005; Abernethy and Frongillo, 2012) or ratios of expectations with the
same denominator (Frongillo and Kash, 2015); see also Corollary 4.3. However, for other
situations, such as a combination of different quantiles and/or expectiles, the answer is
positive; see Corollary 4.2. These results rely on ‘Osband’s principle’ which gives
necessary conditions for scoring functions to be strictly $\mathcal{F}$-consistent for a given functional $T$;
see Section 3.


There are more involved functionals that are $k$-elicitable than just the mere
combination of $k$ 1-elicitable components. To illustrate this with a first example, recall that the
variance does not have convex level sets in the sense of Proposition 2.14, whence it is not
elicitable. However, we can easily show that the pair (expectation, variance) is 2-elicitable.


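Corollary 2.16. Let $\mathcal{F}$ be a class of distribution functions on $\mathbb{R}$ with finite second
moments. Then, the functional $T = (T_1, T_2)\colon \mathcal{F} \to \mathbb{R}^2$, defined as $T_1(F) = \int_{\mathbb{R}} y \, \mathrm{d}F(y)$,
$T_2(F) = \int_{\mathbb{R}} y^2 \, \mathrm{d}F(y) - \big( \int_{\mathbb{R}} y \, \mathrm{d}F(y) \big)^2$, is 2-elicitable.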
Proof. Let $\phi\colon \mathbb{R} \to \mathbb{R}$, $z \mapsto \phi(z) = z^2 / (1 + |z|)$. The scoring function $S_1\colon \mathbb{R} \times \mathbb{R} \to \mathbb{R}$,
$(x_1, y) \mapsto S_1(x_1, y) = \phi(y) - \phi(x_1) - \phi'(x_1)(y - x_1)$ is a strictly $\mathcal{F}$-consistent scoring
function for the expectation and $S_2\colon [0, \infty) \times \mathbb{R} \to \mathbb{R}$, $(x_2, y) \mapsto S_2(x_2, y) = \phi(y^2) -
\phi(x_2) - \phi'(x_2)(y^2 - x_2)$ is a strictly $\mathcal{F}$-consistent scoring function for the second moment.
Hence, invoking Lemma 2.15, the pair (expectation, second moment) is 2-elicitable. Using
the revelation principle given in Proposition 2.13 yields the assertion. □
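The two steps of this proof can be traced numerically. The following sketch is an illustration under stated assumptions: unit weights are used in Lemma 2.15, the bijection is taken to be $g(m_1, m_2) = (m_1, m_2 - m_1^2)$, and $F$ is the empirical distribution of five hypothetical observations. The grid and all function names are likewise assumptions made for this example. A grid search over (mean, variance) reports then locates the empirical mean and the empirical variance.

```python
import statistics

def phi(z):
    return z * z / (1 + abs(z))

def dphi(z):
    """Derivative of phi; valid for all real z."""
    return z * (2 + abs(z)) / (1 + abs(z)) ** 2

def bregman(x, t):
    """phi(t) - phi(x) - phi'(x) * (t - x); nonnegative since phi is convex."""
    return phi(t) - phi(x) - dphi(x) * (t - x)

def score_moments(m1, m2, y):
    """S_1(m1, y) + S_2(m2, y) for the pair (expectation, second moment)."""
    return bregman(m1, y) + bregman(m2, y * y)

def score_mean_variance(mean, var, y):
    """Revelation principle: evaluate at g^{-1}(mean, var) = (mean, var + mean**2)."""
    return score_moments(mean, var + mean ** 2, y)

sample = [0.0, 1.0, 2.0, 3.0, 10.0]   # hypothetical observations

def expected(mean, var):
    return sum(score_mean_variance(mean, var, y) for y in sample) / len(sample)

means = [i / 10 for i in range(61)]                # 0.0, 0.1, ..., 6.0
variances = [j / 100 for j in range(1000, 1501)]   # 10.00, 10.01, ..., 15.00
best = min(((m, v) for m in means for v in variances), key=lambda mv: expected(*mv))

print(best, (statistics.mean(sample), statistics.pvariance(sample)))
```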


In Section 5, we show that the concept of $k$-elicitability is not restricted to
functionals that can be obtained by combining Lemma 2.15 and the revelation principle. It is
shown in Weber (2006, Example 3.4) and Gneiting (2011, Theorem 11) that the
coherent risk measure Expected Shortfall at level $\alpha$, $\alpha \in (0,1)$, does not have convex level
sets and is therefore not elicitable. In contrast, we show in Corollary 5.5 that the pair
(Value at Risk$_\alpha$, Expected Shortfall$_\alpha$) is 2-elicitable relative to the class of distributions
on $\mathbb{R}$ with finite first moment and unique $\alpha$-quantiles. This refutes Proposition 2.3 of
Osband (1985); see Remark 5.3 for a discussion.


3 Osband’s principle 


In this section, we give necessary conditions for the strict $\mathcal{F}$-consistency of a scoring
function $S$ for a functional $T\colon \mathcal{F} \to \mathsf{A}$. In the light of Lemma 2.11 and the discussion
thereafter, we have to impose some richness conditions on the class $\mathcal{F}$ as well as on the
‘variability’ of the functional $T$. To this end, we establish a link between strictly
$\mathcal{F}$-consistent scoring functions and strict $\mathcal{F}$-identification functions. We illustrate the idea
in the one-dimensional case. Let $\mathcal{F}$ be a class of distribution functions on $\mathbb{R}$, $T\colon \mathcal{F} \to \mathbb{R}$ a
functional and $S\colon \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ a strictly $\mathcal{F}$-consistent scoring function for $T$. Furthermore,
let $V\colon \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ be an oriented strict $\mathcal{F}$-identification function for $T$. Then, under
certain regularity conditions, there is a non-negative function $h\colon \mathbb{R} \to \mathbb{R}$ such that

\[
\frac{\mathrm{d}}{\mathrm{d}x} S(x, y) = h(x) V(x, y). \tag{3.1}
\]

If we naively swap differentiation and expectation and $h$ does not vanish, the form (3.1)
plus the identification property of $V$ are sufficient for the first order condition on $\bar S(\cdot, F)$,
$F \in \mathcal{F}$, to be satisfied, and the orientation of $V$ as well as the fact that $h$ is positive are
sufficient for $\bar S(\cdot, F)$ to satisfy the second order condition for strict $\mathcal{F}$-consistency. So the
really interesting part is to show that the form given in (3.1) is necessary for the strict
$\mathcal{F}$-consistency of a scoring function for $T$.
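The form (3.1) can be verified by hand for the mean, and the check is easy to automate. The sketch below is an illustration under assumptions: it uses the Bregman-type score $S(x, y) = \phi(y) - \phi(x) - \phi'(x)(y - x)$ with $\phi(z) = z^2/(1 + |z|)$, for which differentiation gives $h(x) = \phi''(x) = 2/(1 + |x|)^3$ and $V(x, y) = x - y$. The evaluation points, the step size of the finite difference and the sample are arbitrary choices.

```python
def phi(z):
    return z * z / (1 + abs(z))

def dphi(z):
    """First derivative of phi."""
    return z * (2 + abs(z)) / (1 + abs(z)) ** 2

def h(x):
    """Second derivative of phi; strictly positive."""
    return 2 / (1 + abs(x)) ** 3

def V(x, y):
    """Oriented strict identification function for the mean."""
    return x - y

def S(x, y):
    """Bregman-type score for the mean built from phi."""
    return phi(y) - phi(x) - dphi(x) * (y - x)

def dS_dx(x, y, eps=1e-5):
    """Central finite difference approximation of the derivative of S in x."""
    return (S(x + eps, y) - S(x - eps, y)) / (2 * eps)

points = [(-2.0, 1.0), (0.5, 3.0), (1.5, -4.0)]   # arbitrary (x, y) pairs
max_gap = max(abs(dS_dx(x, y) - h(x) * V(x, y)) for x, y in points)
print(max_gap)   # small, up to discretization and rounding error

# First order condition after averaging over a hypothetical sample with mean 3.2.
sample = [0.0, 1.0, 2.0, 3.0, 10.0]
first_order = sum(h(3.2) * V(3.2, y) for y in sample) / len(sample)
print(first_order)   # vanishes at the mean, up to rounding
```

Because $h$ is positive and the averaged $V$ changes sign from negative to positive at the mean, the averaged derivative does too, which is the second order condition mentioned above.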


The idea of this characterization originates from Osband (1985). He gives a
characterization including $\mathbb{R}^k$-valued functionals, but for his proof he assumes that $\mathcal{F}$ contains
all distributions with finite support. This is not a problem per se, but in the light of
Lemma 2.11 and the discussion thereafter it would be desirable to weaken this assumption
or to complement the result. Gneiting (2011) illustrates Osband’s principle in a quite
intuitive manner for the one-dimensional case. In Steinwart et al. (2014, Theorem 5) there
is a rigorous statement of Osband’s principle for the one-dimensional case. We shall give
a proof in the setting of an $\mathbb{R}^k$-valued functional that does not rely on the existence of
distributions with finite support in $\mathcal{F}$.

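Let $\mathcal{F}$ be a class of distribution functions on $\mathsf{O} \subseteq \mathbb{R}^d$. Fix a functional $T\colon \mathcal{F} \to \mathsf{A} \subseteq \mathbb{R}^k$,
an identification function $V\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}^k$ and a scoring function $S\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}$. We
introduce the following collection of regularity assumptions.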

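Assumption (V1). For every $x \in \operatorname{int}(\mathsf{A})$ there are $F_1, \ldots, F_{k+1} \in \mathcal{F}$ such that

\[
0 \in \operatorname{int}\big( \operatorname{conv}\big( \{ \bar V(x, F_1), \ldots, \bar V(x, F_{k+1}) \} \big) \big).
\]

Remark 3.1. Assumption (V1) implies that for every $x \in \operatorname{int}(\mathsf{A})$ there are $F_1, \ldots, F_k \in \mathcal{F}$
such that the vectors $\bar V(x, F_1), \ldots, \bar V(x, F_k)$ are linearly independent.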


Assumption (V1) ensures that the class $\mathcal{F}$ is ‘rich’ enough, meaning that the functional
$T$ varies sufficiently in order to derive a necessary form of the scoring function $S$ in
Theorem 3.2. We emphasize that assumptions like (V1) are classical in the literature. For
the case of $k$-elicitability, Osband (1985) assumes that $0 \in \operatorname{int}(\operatorname{conv}(\{V(x, y) : y \in \mathsf{O}\}))$.
Steinwart et al. (2014, Definition 8) and Lambert (2013) treat the case $k = 1$ and work
under the assumption that the functional is strictly locally non-constant, which implies
assumption (V1) if the functional is identifiable.

Assumption (V2). For every $F \in \mathcal{F}$, the function $V(\cdot,F)\colon \mathsf{A} \to \mathbb{R}^k$, $x \mapsto V(x,F)$, is continuous.

Assumption (V3). For every $F \in \mathcal{F}$, the function $V(\cdot,F)$ is continuously differentiable.

If the function $x \mapsto V(x,y)$, $y \in \mathsf{O}$, is continuous (continuously differentiable), assumption (V2) (assumption (V3)) is directly satisfied, and it is even equivalent to (V2) ((V3)) if $\mathcal{F}$ contains all measures with finite support. However, (V2) and (V3) are much weaker requirements if we move away from distributions with finite support. To illustrate this fact, let $k = 1$ and $V(x,y) = \mathbf{1}\{y \le x\} - \alpha$, $\alpha \in (0,1)$, which is a strict $\mathcal{F}$-identification function for the $\alpha$-quantile. Of course, $V(\cdot,y)$ is not continuous. But if $\mathcal{F}$ contains only probability distributions $F$ that have a continuous derivative $f = F'$, then $V(x,F) = F(x) - \alpha$ and $\frac{d}{dx} V(x,F) = f(x)$, so $V$ satisfies (V2) and (V3). The following assumptions (S1) and (S2) are conditions similar to (V2) and (V3), but for scoring functions instead of identification functions.
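The quantile illustration above can be checked numerically. The following is a minimal sketch, assuming $F$ is the standard normal distribution and $\alpha = 0.3$ (both choices, and all function names, are illustrative and not taken from the paper). It compares a Monte Carlo average of the discontinuous $V(x,y)$ with the smooth expression $F(x) - \alpha$.

```python
import math
import random


def normal_cdf(x):
    """Distribution function of the standard normal law (assumed F)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def quantile_identification(x, y, alpha):
    """V(x, y) = 1{y <= x} - alpha; jumps in x for every fixed y."""
    return (1.0 if y <= x else 0.0) - alpha


def expected_identification_mc(x, alpha, n=100_000, seed=0):
    """Monte Carlo estimate of V(x, F) for F standard normal."""
    rng = random.Random(seed)
    draws = (rng.gauss(0.0, 1.0) for _ in range(n))
    return sum(quantile_identification(x, y, alpha) for y in draws) / n


if __name__ == "__main__":
    alpha = 0.3
    for x in (-1.0, 0.0, 0.5):
        estimate = expected_identification_mc(x, alpha)
        exact = normal_cdf(x) - alpha  # smooth in x although V(., y) is not
        print(f"x={x:+.1f}  Monte Carlo={estimate:+.4f}  F(x)-alpha={exact:+.4f}")
```

The pointwise function jumps at $x = y$, while its expectation inherits the smoothness of $F$; this is the sense in which (V2) and (V3) are assumptions on the class $\mathcal{F}$ rather than on $V(\cdot,y)$.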

Assumption (S1). For every $F \in \mathcal{F}$, the function $S(\cdot,F)\colon \mathsf{A} \to \mathbb{R}$, $x \mapsto S(x,F)$, is continuously differentiable.

Assumption (S2). For every $F \in \mathcal{F}$, the function $S(\cdot,F)$ is continuously differentiable and the gradient is locally Lipschitz continuous. Furthermore, $S(\cdot,F)$ is twice continuously differentiable at $t = T(F) \in \operatorname{int}(\mathsf{A})$.

Note that assumption (S2) implies that the gradient of $S(\cdot,F)$ is (totally) differentiable for almost all $x \in \mathsf{A}$ by Rademacher's theorem, which in turn indicates that the Hessian of $S(\cdot,F)$ exists for almost all $x \in \mathsf{A}$ and is symmetric by Schwarz's theorem; see Grauert and Fischer (1978, p. 57).


Theorem 3.2 (Osband's principle). Let $\mathcal{F}$ be a convex class of distribution functions on $\mathsf{O} \subseteq \mathbb{R}^d$. Let $T\colon \mathcal{F} \to \mathsf{A} \subseteq \mathbb{R}^k$ be a surjective, elicitable and identifiable functional with a strict $\mathcal{F}$-identification function $V\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}^k$ and a strictly $\mathcal{F}$-consistent scoring function $S\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}$. If the assumptions (V1) and (S1) hold, then there exists a matrix-valued function $h\colon \operatorname{int}(\mathsf{A}) \to \mathbb{R}^{k \times k}$ such that for $l \in \{1, \ldots, k\}$

\[
\partial_l S(x,F) = \sum_{m=1}^{k} h_{lm}(x)\, V_m(x,F) \tag{3.2}
\]

for all $x \in \operatorname{int}(\mathsf{A})$ and $F \in \mathcal{F}$. If, in addition, assumption (V2) holds, then $h$ is continuous. Under the additional assumptions (V3) and (S2), the function $h$ is locally Lipschitz continuous.


The proof of Theorem 3.2 follows closely the idea of the proof of Osband (1985, Theorem 2.1). However, the latter proof only works under the condition that the class $\mathcal{F}$ contains all distributions with finite support. He conjectures that the assertion also holds if $\mathcal{F}$ consists only of absolutely continuous distributions, but we do not believe that his approach is feasible for this case. To show Theorem 3.2 we apply a technique similar to the one in the proof of Osband (1985, Lemma 2.2), which is based on a finite-dimensional argument.
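As a worked illustration of the form (3.2) in the simplest case $k = 1$, consider two standard scores. This example is added for orientation and is not part of the proof; the second line assumes that $F$ has a continuous density.

```latex
\begin{align*}
% Mean: squared error with identification function V(x,y) = x - y
S(x,y) &= (x-y)^2, & V(x,y) &= x - y, &
\partial_x S(x,F) &= 2\bigl(x - \mathbb{E}_F[Y]\bigr) = h(x)\,V(x,F), & h &\equiv 2,\\
% alpha-quantile: pinball loss with V(x,y) = 1{y <= x} - alpha
S(x,y) &= \bigl(\mathbf{1}\{y \le x\} - \alpha\bigr)(x-y), & V(x,y) &= \mathbf{1}\{y \le x\} - \alpha, &
\partial_x S(x,F) &= F(x) - \alpha = h(x)\,V(x,F), & h &\equiv 1.
\end{align*}
```

In both cases the derivative of the expected score factorizes into a function of $x$ alone times the expected identification function, which is exactly the structure that Theorem 3.2 shows to be necessary.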


Remark 3.3. Let $h\colon \mathsf{A} \to \mathbb{R}^{k \times k}$ be a function such that the restriction $h|_{\operatorname{int}(\mathsf{A})}$ to $\operatorname{int}(\mathsf{A})$ coincides with the function $h$ in (3.2). Then the function

\[
hV\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}^k, \quad (x,y) \mapsto hV(x,y) = h(x)V(x,y)
\]

is an $\mathcal{F}$-identification function for $T$. If $\det(h(x)) \neq 0$ for all $x \in \mathsf{A}$, then $hV$ is even a strict $\mathcal{F}$-identification function for $T$. However, even if $V$ is oriented, $hV$ is not necessarily an oriented strict $\mathcal{F}$-identification function.


Under the conditions of Theorem 3.2, equation (3.2) gives a characterization of the partial derivatives of the expected score. If we impose more smoothness assumptions on the expected score, we are also able to give a characterization of the second order derivatives of the expected score. In particular, one has the following result.

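Corollary 3.4. Let $\mathcal{F}$ be a convex class of distribution functions on $\mathsf{O} \subseteq \mathbb{R}^d$. For a surjective, elicitable and identifiable functional $T\colon \mathcal{F} \to \mathsf{A} \subseteq \mathbb{R}^k$ with a strict $\mathcal{F}$-identification function $V\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}^k$ and a strictly $\mathcal{F}$-consistent scoring function $S\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}$ that satisfy assumptions (V1), (V3) and (S2), we have the following identities for the second order derivatives

\begin{align}
\partial_m \partial_l S(x,F) &= \sum_{i=1}^{k} \partial_m h_{li}(x)\, V_i(x,F) + h_{li}(x)\, \partial_m V_i(x,F) \tag{3.3}\\
&= \sum_{i=1}^{k} \partial_l h_{mi}(x)\, V_i(x,F) + h_{mi}(x)\, \partial_l V_i(x,F) = \partial_l \partial_m S(x,F), \notag
\end{align}

for all $l, m \in \{1, \ldots, k\}$, for all $F \in \mathcal{F}$ and almost all $x \in \operatorname{int}(\mathsf{A})$, where $h$ is the matrix-valued function appearing at (3.2). In particular, (3.3) holds for $x = T(F) \in \operatorname{int}(\mathsf{A})$.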
Theorem 3.2 and Corollary 3.4 establish necessary conditions for strictly $\mathcal{F}$-consistent scoring functions on the level of the expected scores. If the class $\mathcal{F}$ is rich enough and the scoring and identification functions are smooth enough pointwise in the following sense, we can also deduce a necessary condition for $S$ which holds pointwise.

Assumption (F1). For every $y \in \mathsf{O}$ there exists a sequence $(F_n)_n$ of distributions $F_n \in \mathcal{F}$ that converges weakly to the Dirac measure $\delta_y$ such that the support of $F_n$ is contained in a compact set $K$ for all $n$.

Assumption (VS1). Suppose that the complement of the set

\[
C := \{(x,y) \in \mathsf{A} \times \mathsf{O} \mid V(x,\cdot) \text{ and } S(x,\cdot) \text{ are continuous at the point } y\}
\]

has $(k+d)$-dimensional Lebesgue measure zero.

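Proposition 3.5. Let $\mathcal{F}$ be convex. Assume that $\operatorname{int}(\mathsf{A}) \subseteq \mathbb{R}^k$ is a star domain and let $T\colon \mathcal{F} \to \mathsf{A}$ be a surjective, elicitable and identifiable functional with a strict $\mathcal{F}$-identification function $V\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}^k$ and a strictly $\mathcal{F}$-consistent scoring function $S\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}$. Suppose that assumptions (V1), (V2), (S1), (F1) and (VS1) hold. Let $h$ be the matrix-valued function appearing at (3.2). Then, the scoring function $S$ is necessarily of the form

\begin{align}
S(x,y) = \sum_{r=1}^{k} \sum_{m=1}^{k} \int_{z_r}^{x_r} & h_{rm}(x_1, \ldots, x_{r-1}, v, z_{r+1}, \ldots, z_k) \tag{3.4}\\
& \times V_m(x_1, \ldots, x_{r-1}, v, z_{r+1}, \ldots, z_k, y)\, dv + a(y) \notag
\end{align}

for almost all $(x,y) \in \mathsf{A} \times \mathsf{O}$, for some star point $z = (z_1, \ldots, z_k) \in \operatorname{int}(\mathsf{A})$ and some $\mathcal{F}$-integrable function $a\colon \mathsf{O} \to \mathbb{R}$. On the level of the expected score $S(x,F)$, equation (3.4) holds for all $x \in \operatorname{int}(\mathsf{A})$, $F \in \mathcal{F}$.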

While Theorem 3.2, Corollary 3.4 and Proposition 3.5 only establish necessary conditions for strictly $\mathcal{F}$-consistent scoring functions for some functional $T$, they often indicate how to construct strictly $\mathcal{F}$-consistent scoring functions starting with a strict $\mathcal{F}$-identification function $V$ for $T$. For the one-dimensional case, one can use the fact that, subject to some mild regularity conditions, if $V$ is a strict $\mathcal{F}$-identification function, then either $V$ or $-V$ is oriented; see Remark 2.6. Supposing that $V$ is oriented, we can choose any strictly positive function $h\colon \mathsf{A} \to \mathbb{R}$ to get the derivative of a strictly $\mathcal{F}$-consistent scoring function. Then integration yields the desired strictly $\mathcal{F}$-consistent scoring function.
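The one-dimensional recipe just described (pick an oriented $V$, multiply by a strictly positive $h$, integrate) can be sketched as follows. This is a minimal illustration for the mean, assuming the oriented identification function $V(x,y) = x - y$, the hypothetical choice $h(v) = e^{v}$ and the base point $z = 0$; none of these specific choices come from the paper.

```python
import math


def V_mean(x, y):
    """Oriented strict identification function of the mean."""
    return x - y


def h(v):
    """Strictly positive weight function (illustrative choice)."""
    return math.exp(v)


def score_numeric(x, y, z=0.0, steps=2000):
    """S(x, y) = integral from z to x of h(v) * V(v, y) dv, midpoint rule."""
    width = (x - z) / steps
    total = 0.0
    for i in range(steps):
        v = z + (i + 0.5) * width
        total += h(v) * V_mean(v, y)
    return total * width


def score_closed_form(x, y, z=0.0):
    """Antiderivative of exp(v) * (v - y) is exp(v) * (v - y - 1)."""
    return math.exp(x) * (x - y - 1.0) - math.exp(z) * (z - y - 1.0)


def mean_score(x, sample):
    """Average realized score of the forecast x over the sample."""
    return sum(score_closed_form(x, y) for y in sample) / len(sample)


sample = [-1.0, 0.2, 0.5, 1.3, 2.0]  # sample mean is 0.6
grid = [i / 100.0 for i in range(-200, 301)]
best = min(grid, key=lambda x: mean_score(x, sample))

if __name__ == "__main__":
    print("grid minimizer of the average score:", best)
    print("sample mean:", sum(sample) / len(sample))
```

The derivative of the average score is $e^{x}(x - \bar{y})$, which is negative below the sample mean $\bar{y}$ and positive above it, so the minimum found on the grid is global and not merely local.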

Establishing sufficient conditions for scoring functions to be strictly $\mathcal{F}$-consistent for $T$ is generally more involved in the case $k > 1$. First of all, working under assumption (S2), the symmetry of the Hessian $\nabla^2 S(x,F)$ imposes strong necessary conditions on the functions $h_{lm}$; see for example Proposition 4.1, which treats the case where all components of the functional $T = (T_1, \ldots, T_k)$ are elicitable and identifiable. The example of spectral risk measures is treated in Section 5. Secondly, (3.2) and (3.3) are necessary conditions for $S(x,F)$ having a local minimum in $x = T(F)$, $F \in \mathcal{F}$. Even if we additionally suppose that the Hessian $\nabla^2 S(x,F)$ is strictly positive definite at $x = T(F)$, this is a sufficient condition only for a local minimum at $x = T(F)$, but does not provide any information concerning a global minimum. Consequently, even if the functions $h_{lm}$ satisfy (3.3), one must verify the strict consistency of the scoring function on a case by case basis. This can often be done by showing that the one-dimensional functions $\mathbb{R} \to \mathbb{R}$, $s \mapsto S(t + sv, F)$, with $t = T(F)$, have a global minimum in $s = 0$ for all $v \in \mathbb{S}^{k-1}$ and for all $F \in \mathcal{F}$. This holds for example if the function $(x,y) \mapsto h(x)V(x,y)$ is an oriented strict $\mathcal{F}$-identification function for $T$; see Lemma 2.9. In this step, one may have to impose additional conditions on the functions $h_{lm}$ to ensure sufficiency which cannot always be shown to be necessary.

We conclude this section with a remark clarifying how the function h in Osband’s 
principle behaves under the revelation principle. 

Remark 3.6. Let $g\colon \mathsf{A} \to \mathsf{A}'$ be a bijection, $\mathsf{A}, \mathsf{A}' \subseteq \mathbb{R}^k$. Suppose we have an identification function $V$ for a functional $T\colon \mathcal{F} \to \mathsf{A}$ and we choose the identification function $V_g(x',y) = V(g^{-1}(x'),y)$ as an identification function for the functional $T_g = g \circ T$. If the functional $T$ (and hence also $T_g$ by Proposition 2.13) is elicitable, then the gradients of the expected scores of $T$ and $T_g$ are of the form (3.2) with functions $h$ and $h_g$, respectively. The functions $h$ and $h_g$ are connected by the following relation

\[
(h_g)_{lm}(x') = \sum_{r=1}^{k} \partial_l (g^{-1})_r(x')\, h_{rm}(g^{-1}(x')), \qquad x' \in \mathsf{A}'.
\]


4 Functionals with elicitable components 


Suppose that the functional $T = (T_1, \ldots, T_k)\colon \mathcal{F} \to \mathsf{A} \subseteq \mathbb{R}^k$ consists of 1-elicitable components $T_m$. As prototypical examples of such 1-elicitable components, we consider the functionals given in Table 1, where we implicitly assume that $\mathsf{O} \subseteq \mathbb{R}$ if a quantile or an expectile is a part of $T$. With the given identification functions, it turns out that usually $T$ (or some subset of its components) fulfills one of the following two assumptions.

Assumption (V4). Let assumption (V3) hold. For all $r \in \{1, \ldots, k\}$ and for all $t \in \operatorname{int}(\mathsf{A}) \cap T(\mathcal{F})$ there are $F_1, F_2 \in T^{-1}(\{t\})$ such that

\[
\partial_l V_l(t,F_1) = \partial_l V_l(t,F_2) \quad \forall\, l \in \{1, \ldots, k\} \setminus \{r\}, \qquad \partial_r V_r(t,F_1) \neq \partial_r V_r(t,F_2).
\]

Assumption (V5). Let assumption (V3) hold. For all $F \in \mathcal{F}$ there is a constant $c_F \neq 0$ such that for all $r \in \{1, \ldots, k\}$ and for all $x \in \operatorname{int}(\mathsf{A})$ it holds that

\[
\partial_r V_r(x,F) = c_F.
\]

Following Frongillo and Kash (2015), we call a functional that fulfills assumption (V5) with $c_F = 1$ for all $F \in \mathcal{F}$ a linear functional.


Table 1: Strict identification functions for $k = 1$; see Gneiting (2011, Table 9).

    Functional                                        Strict identification function
    Ratio $\mathbb{E}_F[p(Y)] / \mathbb{E}_F[q(Y)]$   $V(x,y) = x\,q(y) - p(y)$
    $\alpha$-Quantile                                 $V(x,y) = \mathbf{1}\{y \le x\} - \alpha$
    $\tau$-Expectile                                  $V(x,y) = 2\,|\mathbf{1}\{y \le x\} - \tau|\,(x - y)$

Prima facie, assumptions (V4) and (V5) are mutually exclusive. Considering the functionals in Table 1 with the associated identification functions, we obtain, for $x = (x_1, \ldots, x_k) \in \mathbb{R}^k$, $F \in \mathcal{F}$ with derivative $F' = f$ and $m \in \{1, \ldots, k\}$,

\[
\partial_m V_m(x,F) =
\begin{cases}
q_m(F), & \text{if } V_m(x,y) = x_m q_m(y) - p_m(y),\\
f(x_m), & \text{if } V_m(x,y) = \mathbf{1}\{y \le x_m\} - \alpha_m,\\
(2 - 4\tau_m) F(x_m) + 2\tau_m, & \text{if } V_m(x,y) = 2\,|\mathbf{1}\{y \le x_m\} - \tau_m|\,(x_m - y),
\end{cases}
\]

where $p_m, q_m\colon \mathsf{O} \to \mathbb{R}$ are some $\mathcal{F}$-integrable functions such that $q_m(F) \neq 0$ for all $F \in \mathcal{F}$, and $\alpha_m, \tau_m \in (0,1)$. We see that (V5) is satisfied if, for example, $T$ is a vector of ratios of expectations with the same denominator (compare the situation in Frongillo and Kash (2015)). In this situation, we have that $c_F = q(F)$. On the other hand, if the components of $T$ are quantiles, expectiles with $\tau_m \neq 1/2$ or ratios of expectations with different denominators, and additionally the class $\mathcal{F}$ is rich enough, then (V4) might be satisfied.
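The expectile case of the derivative formula is the least obvious of the three. A short worked derivation, assuming that $F$ has a density $f$ and a finite first moment (so that differentiation under the integral sign is justified), reads as follows.

```latex
\begin{align*}
V_m(x,F)
  &= 2(1-\tau_m)\int_{-\infty}^{x_m} (x_m - y)\, f(y)\, dy
     \;-\; 2\tau_m \int_{x_m}^{\infty} (y - x_m)\, f(y)\, dy,\\
\partial_m V_m(x,F)
  &= 2(1-\tau_m)\, F(x_m) + 2\tau_m \bigl(1 - F(x_m)\bigr)
   = (2 - 4\tau_m)\, F(x_m) + 2\tau_m .
\end{align*}
```

For $\tau_m = 1/2$ the right-hand side is identically 1, so the derivative no longer depends on $F$; this is why the $1/2$-expectile (the mean) falls under (V5) rather than (V4).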


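Proposition 4.1. Let $T_m\colon \mathcal{F} \to \mathsf{A}_m \subseteq \mathbb{R}$ be 1-elicitable and 1-identifiable functionals with oriented strict $\mathcal{F}$-identification functions $V_m\colon \mathsf{A}_m \times \mathsf{O} \to \mathbb{R}$ for $m \in \{1, \ldots, k\}$. Let $\mathsf{A} := T(\mathcal{F}) \subseteq \mathsf{A}_1 \times \cdots \times \mathsf{A}_k$. Then $V\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}^k$ defined as

\[
V(x_1, \ldots, x_k, y) = \bigl(V_1(x_1,y), \ldots, V_k(x_k,y)\bigr)^{\top} \tag{4.1}
\]

is an oriented strict $\mathcal{F}$-identification function for $T = (T_1, \ldots, T_k)$.

Let $\mathcal{F}$ be convex and $S\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}$ be a strictly $\mathcal{F}$-consistent scoring function for $T = (T_1, \ldots, T_k)$. Suppose that assumptions (V1), (V3) and (S2) hold, and let $h\colon \operatorname{int}(\mathsf{A}) \to \mathbb{R}^{k \times k}$ be the function given at (3.2). Define $\mathsf{A}'_m := \{x_m : \exists\, (z_1, \ldots, z_k) \in \operatorname{int}(\mathsf{A}),\ z_m = x_m\}$.

(i) If assumption (V4) holds and $\mathsf{A}$ is connected, then there are functions $g_m\colon \mathsf{A}'_m \to \mathbb{R}$, $m \in \{1, \ldots, k\}$, $g_m > 0$, such that

\[
h_{mm}(x_1, \ldots, x_k) = g_m(x_m)
\]

for all $m \in \{1, \ldots, k\}$ and $(x_1, \ldots, x_k) \in \operatorname{int}(\mathsf{A})$, and

\[
h_{rl}(x) = 0 \tag{4.2}
\]

for all $r, l \in \{1, \ldots, k\}$, $l \neq r$, and for all $x \in \operatorname{int}(\mathsf{A})$.

(ii) If assumption (V5) holds, then

\[
\partial_l h_{rm}(x) = \partial_r h_{lm}(x), \qquad h_{rl}(x) = h_{lr}(x) \tag{4.3}
\]

for all $r, l, m \in \{1, \ldots, k\}$, $l \neq r$, where the first identity holds for almost all $x \in \operatorname{int}(\mathsf{A})$ and the second identity for all $x \in \operatorname{int}(\mathsf{A})$. Moreover, the matrix $(h_{rl}(x))_{l,r = 1, \ldots, k}$ is positive definite for all $x \in \operatorname{int}(\mathsf{A})$.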
A direct consequence of Proposition 4.1 (i) and Proposition 3.5 is the following characterization of the class of strictly $\mathcal{F}$-consistent scoring functions for functionals with elicitable components satisfying assumption (V4). In particular, it gives a characterization of the class of strictly $\mathcal{F}$-consistent scoring functions for a vector of different quantiles and/or different expectiles (with the exception of the $1/2$-expectile), thus answering a question raised in Gneiting and Raftery (2007, p. 370).


Corollary 4.2. Let $\mathcal{F}$ be convex. Suppose that $T = (T_1, \ldots, T_k)\colon \mathcal{F} \to \mathsf{A}$ is a functional with 1-identifiable components having oriented strict $\mathcal{F}$-identification functions. Assume that the interior of $\mathsf{A} := T(\mathcal{F}) \subseteq \mathsf{A}_1 \times \cdots \times \mathsf{A}_k$ is a star domain and that assumptions (V1), (V3), (S2), (F1) and (VS1) hold for $T$. If assumption (V4) holds, then a scoring function $S\colon \mathsf{A} \times \mathsf{O} \to \mathbb{R}$ is strictly $\mathcal{F}$-consistent for $T$ if and only if it is of the form

\[
S(x_1, \ldots, x_k, y) = \sum_{m=1}^{k} S_m(x_m, y), \tag{4.4}
\]

for almost all $(x,y) \in \mathsf{A} \times \mathsf{O}$, where $S_m\colon \mathsf{A}_m \times \mathsf{O} \to \mathbb{R}$, $m \in \{1, \ldots, k\}$, are some strictly $\mathcal{F}$-consistent scoring functions for $T_m$.
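The separable form (4.4) can be illustrated with a pair of quantiles. The sketch below is an illustration only: it assumes the pinball loss as the component score $S_m$, the levels $0.25$ and $0.75$, and a ten-point sample standing in for $F$ (all hypothetical choices). It minimizes the average of the summed score over pairs of candidate forecasts.

```python
from itertools import product


def pinball(x, y, alpha):
    """A strictly consistent score for the alpha-quantile (one component S_m)."""
    return ((1.0 if y <= x else 0.0) - alpha) * (x - y)


def separable_score(xs, y, alphas):
    """Score of the form (4.4): a sum of one-dimensional component scores."""
    return sum(pinball(x, y, a) for x, a in zip(xs, alphas))


def mean_score(xs, sample, alphas):
    """Average realized score of the forecast vector xs over the sample."""
    return sum(separable_score(xs, y, alphas) for y in sample) / len(sample)


sample = [float(v) for v in range(1, 11)]  # empirical distribution on 1, ..., 10
alphas = (0.25, 0.75)

# Search over all pairs of sample points as candidate forecasts.
best = min(product(sample, repeat=2), key=lambda xs: mean_score(xs, sample, alphas))

if __name__ == "__main__":
    print("minimizing pair of forecasts:", best)
```

Because the score is a sum, each coordinate is optimized independently of the other; the joint minimizer is simply the pair of empirical quantiles (the third and eighth order statistics for this sample).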


If we are in the situation of Proposition 4.1 (ii), that is, $T$ satisfies assumption (V5), it is well known that a statement analogous to Corollary 4.2 is false. Let $F \in \mathcal{F}$ and $t = T(F)$. Recalling the orientation of the components $V_m$, we can immediately deduce that there is $c_F > 0$ such that $V(t + sv, F) = c_F\, s\, v$ for $s \in \mathbb{R}$ and $v \in \mathbb{S}^{k-1}$. Hence, one obtains

\[
v^{\top} h(t + sv)\, V(t + sv, F) = c_F\, s\, v^{\top} h(t + sv)\, v.
\]

Consequently, if $\mathsf{A}$ is open and convex, the positive definiteness of $h(x)$ for all $x \in \mathsf{A}$ is a sufficient condition for the strict $\mathcal{F}$-consistency of $S$ for $T$ by Lemma 2.9 (i). Moreover, we now assume that $T$ is a ratio of expectations with the same denominator $q\colon \mathsf{O} \to \mathbb{R}$, implying that $c_F = q(F)$ for all $F \in \mathcal{F}$. Using Proposition 3.5 and partial integration, we obtain that for almost all $(x,y) \in \mathsf{A} \times \mathsf{O}$ strictly $\mathcal{F}$-consistent scoring functions for $T$ are of the form

\[
S(x,y) = -\phi(x)\, q(y) + \sum_{m=1}^{k} V_m(x,y)\, \partial_m \phi(x) + a(y), \tag{4.5}
\]

with

\[
\phi(x) = \sum_{r=1}^{k} \int_{z_r}^{x_r} \int_{z_r}^{v} h_{rr}(x_1, \ldots, x_{r-1}, w, z_{r+1}, \ldots, z_k)\, dw\, dv, \tag{4.6}
\]

where $(z_1, \ldots, z_k) \in \mathsf{A}$ and $a\colon \mathsf{O} \to \mathbb{R}$ is some $\mathcal{F}$-integrable function. Using (4.3), it follows that the function $\phi$ has Hessian $h$. Therefore, for $\mathsf{A}$ open and convex, $\phi$ is strictly convex. Hence we have shown the following corollary.

Corollary 4.3. Let $\mathcal{F}$ be convex. Let $T = (T_1, \ldots, T_k)\colon \mathcal{F} \to \mathsf{A} \subseteq \mathbb{R}^k$ be a ratio of expectations with the same denominator $q\colon \mathsf{O} \to \mathbb{R}$, $q > 0$. More specifically, let $T$ be a surjective functional with 1-identifiable components with oriented strict identification functions $V_m\colon \mathsf{A}_m \times \mathsf{O} \to \mathbb{R}$, $m \in \{1, \ldots, k\}$, that fulfills assumption (V5). Suppose that $\mathsf{A} \subseteq \mathsf{A}_1 \times \cdots \times \mathsf{A}_k$ is open and convex and that assumptions (V1), (V3), (S2), (F1) and (VS1) hold. Then, a scoring function $S$ is strictly $\mathcal{F}$-consistent for $T$ if and only if it is of the form (4.5) for almost all $(x,y) \in \mathsf{A} \times \mathsf{O}$ with a twice continuously differentiable strictly convex function $\phi\colon \mathsf{A} \to \mathbb{R}$ of the form (4.6) and an $\mathcal{F}$-integrable function $a\colon \mathsf{O} \to \mathbb{R}$.


This corollary recovers results of Osband and Reichelstein (1985), Banerjee et al. (2005) and Abernethy and Frongillo (2012) if $T$ is linear (meaning $q \equiv 1$), which show that all consistent scoring functions for linear functionals are so-called Bregman functions, that is, functions of the form (4.5) with $q \equiv 1$ and a convex function $\phi$. Frongillo and Kash (2015, Theorem 13) also treat the case of more general functions $q$. Comparing these results with Corollary 4.3, one can see that on the one hand, they are stronger as they require weaker smoothness assumptions on the scoring function, but on the other hand, they are weaker since they assume that $\mathcal{F}$ contains all one-point distributions $\delta_y$.
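A score of the form (4.5) can be written out for a two-dimensional mean. The following minimal sketch assumes $q \equiv 1$, $V_m(x,y) = x_m - y_m$, $a \equiv 0$ and the hypothetical non-separable strictly convex function $\phi(x) = x_1^2 + x_1 x_2 + x_2^2$; the sample is made up for illustration.

```python
def phi(x):
    """Strictly convex function with Hessian [[2, 1], [1, 2]] (illustrative)."""
    x1, x2 = x
    return x1 * x1 + x1 * x2 + x2 * x2


def grad_phi(x):
    """Gradient of phi."""
    x1, x2 = x
    return (2.0 * x1 + x2, x1 + 2.0 * x2)


def bregman_type_score(x, y):
    """S(x, y) = -phi(x) + sum_m (x_m - y_m) * d_m phi(x), i.e. (4.5) with q = 1, a = 0."""
    g = grad_phi(x)
    return -phi(x) + sum((xm - ym) * gm for xm, ym, gm in zip(x, y, g))


def mean_score(x, sample):
    """Average realized score of the forecast x over the sample."""
    return sum(bregman_type_score(x, y) for y in sample) / len(sample)


sample = [(0.0, 1.0), (2.0, -1.0), (1.0, 3.0), (-1.0, 1.0)]
n = len(sample)
sample_mean = (sum(y[0] for y in sample) / n, sum(y[1] for y in sample) / n)


def excess(d):
    """Average score at sample_mean + d minus average score at sample_mean."""
    shifted = (sample_mean[0] + d[0], sample_mean[1] + d[1])
    return mean_score(shifted, sample) - mean_score(sample_mean, sample)


if __name__ == "__main__":
    for d in [(0.1, 0.0), (0.0, -0.1), (0.3, -0.2)]:
        print(d, excess(d))
```

For this quadratic $\phi$ the excess score equals the Bregman divergence $d_1^2 + d_1 d_2 + d_2^2$ of the perturbation $d$, which is strictly positive for $d \neq 0$; the off-diagonal entry of the Hessian shows that the score need not separate across components.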

Remark 4.4. One might wonder about necessary conditions on the matrix-valued function $h$ in the flavor of Proposition 4.1 if the $k$ components of the functional $T$ can be regrouped into (i) a new functional $T'_1\colon \mathcal{F} \to \mathsf{A}'_1 \subseteq \mathbb{R}^{k'_1}$ with an oriented strict $\mathcal{F}$-identification function $V'_1\colon \mathsf{A}'_1 \times \mathsf{O} \to \mathbb{R}^{k'_1}$ which satisfies assumption (V4), and (ii) several, say $l$, new functionals $T'_m\colon \mathcal{F} \to \mathsf{A}'_m \subseteq \mathbb{R}^{k'_m}$, $m \in \{2, \ldots, l+1\}$, with oriented strict $\mathcal{F}$-identification functions $V'_m\colon \mathsf{A}'_m \times \mathsf{O} \to \mathbb{R}^{k'_m}$ such that each one satisfies assumption (V5), and $k'_1 + \cdots + k'_{l+1} = k$. We can apply Proposition 4.1 to obtain necessary conditions for each of the $(k'_m \times k'_m)$-valued functions $h'_m$, $m \in \{1, \ldots, l+1\}$. Applying Lemma 2.15, we get a possible choice for a strictly $\mathcal{F}$-consistent scoring function $S$ for $T$. On the level of the $k \times k$-valued function $h$ associated to $S$, this means that $h$ is a block diagonal matrix of the form $\operatorname{diag}(h'_1, \ldots, h'_{l+1})$. But what about the necessity of this form? Indeed, if we assume that the blocks in (ii) have maximal size (or equivalently that $l$ is minimal), then one can verify that $h$ must necessarily be of the block diagonal form described above.


5 Spectral risk measures 


Risk measures are a common tool to measure the risk of a financial position $Y$. A risk measure is usually defined as a mapping $\rho$ from some space of random variables, for example $L^\infty$, to the real line. Arguably, the most common risk measure in practice is Value at Risk at level $\alpha$ ($\mathrm{VaR}_\alpha$), which is the generalized $\alpha$-quantile $F^{-1}(\alpha)$, that is,

\[
\mathrm{VaR}_\alpha(Y) := F^{-1}(\alpha) := \inf\{x \in \mathbb{R} : F(x) \ge \alpha\},
\]

where $F$ is the distribution function of $Y$. An important alternative to $\mathrm{VaR}_\alpha$ is Expected Shortfall at level $\alpha$ ($\mathrm{ES}_\alpha$) (also known under the names Conditional Value at Risk or Average Value at Risk). It is defined as

\[
\mathrm{ES}_\alpha(Y) := \frac{1}{\alpha} \int_0^\alpha \mathrm{VaR}_u(Y)\, du, \qquad \alpha \in (0,1], \tag{5.1}
\]

and $\mathrm{ES}_0(Y) = \operatorname{ess\,inf} Y$. Since the influential paper of Artzner et al. (1999) introducing coherent risk measures, there has been a lively debate about which risk measure is best in practice, one of the requirements under discussion being the coherence of a risk measure. We call a functional $\rho$ coherent if it is monotone, meaning that $Y \le X$ a.s. implies that $\rho(Y) \le \rho(X)$; it is superadditive in the sense that $\rho(X + Y) \ge \rho(X) + \rho(Y)$; it is positively homogeneous, which means that $\rho(\lambda Y) = \lambda \rho(Y)$ for all $\lambda \ge 0$; and it is translation invariant, which amounts to $\rho(Y + a) = \rho(Y) + a$ for all $a \in \mathbb{R}$. In the literature on risk measures there are different sign conventions which co-exist. In this paper, a positive value of $Y$ denotes a profit. Moreover, the position $Y$ is considered the more risky the smaller $\rho(Y)$ is. Strictly speaking, we have chosen to work with utility functions instead of risk measures, as for example in Delbaen (2012). The risk measure $\rho$ is called comonotonically additive if $\rho(X + Y) = \rho(X) + \rho(Y)$ for comonotone random variables $X$ and $Y$. Coherent and comonotonically additive risk measures are also called spectral risk measures (Acerbi, 2002). All risk measures of practical interest are law-invariant, that is, if two random variables $X$ and $Y$ have the same law $F$, then $\rho(X) = \rho(Y)$. As we are only concerned with law-invariant risk measures in this paper, we will abuse notation and write $\rho(F) := \rho(X)$ if $X$ has distribution $F$.

One of the main criticisms of $\mathrm{VaR}_\alpha$ is its failure to fulfill the superadditivity property in general (Acerbi, 2002). Furthermore, it fails to take the size of losses beyond the level $\alpha$ into account (Daníelsson et al., 2001). In both of these aspects, $\mathrm{ES}_\alpha$ is a better alternative as it is coherent and comonotonically additive, that is, a spectral risk measure. However, with respect to robustness, some authors argue that $\mathrm{VaR}_\alpha$ should be preferred over $\mathrm{ES}_\alpha$ (Cont et al., 2010; Kou et al., 2013), whereas others argue that the classical statistical notions of robustness are not necessarily appropriate in a risk measurement context (Krätschmer et al., 2012, 2013, 2014). Finally, $\mathrm{ES}_\alpha$ fails to be 1-elicitable (Weber, 2006; Gneiting, 2011), whereas $\mathrm{VaR}_\alpha$ is 1-elicitable for most classes of distributions $\mathcal{F}$ of practical relevance. In fact, except for the expectation, all spectral risk measures fail to be 1-elicitable (Ziegel, 2015); further recent results on elicitable risk measures include Kou and Peng (2014) and Wang and Ziegel (2015), showing that distortion risk measures are rarely elicitable, and Weber (2006), Bellini and Bignozzi (2014) and Delbaen et al. (2014), demonstrating that convex risk measures are only elicitable if they are shortfall risk measures.


We show in Theorem 5.2 (see also Corollaries 5.4 and 5.5) that spectral risk measures having a spectral measure with finite support can be a component of a $k$-elicitable functional. In particular, the pair $(\mathrm{VaR}_\alpha, \mathrm{ES}_\alpha)\colon \mathcal{F} \to \mathbb{R}^2$ is 2-elicitable for any $\alpha \in (0,1)$ subject to mild conditions on the class $\mathcal{F}$. We remark that our results substantially generalize the result of Acerbi and Székely (2014), as detailed below.


Definition 5.1 (Spectral risk measures). Let $\mu$ be a probability measure on $[0,1]$ (called spectral measure) and let $\mathcal{F}$ be a class of distribution functions on $\mathbb{R}$ with finite first moments. Then, the spectral risk measure associated to $\mu$ is the functional $\nu_\mu\colon \mathcal{F} \to \mathbb{R}$ defined as

\[
\nu_\mu(F) := \int_{[0,1]} \mathrm{ES}_\alpha(F)\, \mu(d\alpha).
\]


Kusuoka (2001) and Jouini et al. (2006) have shown that law-invariant coherent and comonotonically additive risk measures are exactly the spectral risk measures in the sense of Definition 5.1 for distributions with compact support. If $\mu = \delta_\alpha$ for some $\alpha \in [0,1]$, then $\nu_\mu(F) = \mathrm{ES}_\alpha(F)$. In particular, $\nu_{\delta_1}(F) = \int y\, dF(y)$ is the expectation of $F$.

In the following theorem, we show that spectral risk measures whose spectral measure $\mu$ has finite support in $(0,1)$ are $k$-elicitable for some $k$. It is possible to extend the result to spectral measures with finite support in $(0,1]$; see Corollary 5.4. If $\mu$ has mass at zero, we believe that $\nu_\mu$ is not $k$-elicitable for any $k$ with respect to interesting classes $\mathcal{F}$. In this case, if the support of $F$ is unbounded below, we have $\nu_\mu(F) = \operatorname{ess\,inf}(F) = -\infty$.
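For an empirical distribution, the quantities in (5.1) and Definition 5.1 reduce to finite sums. The following is a minimal sketch under the paper's sign convention (profits positive, lower tail), assuming an equally weighted sample; the function names and the sample are illustrative and not from the paper.

```python
def value_at_risk(sample, alpha):
    """Generalized alpha-quantile inf{x : F_n(x) >= alpha} of the empirical distribution."""
    ys = sorted(sample)
    n = len(ys)
    for i, y in enumerate(ys, start=1):
        if i / n >= alpha - 1e-12:  # tolerance guards against rounding in i / n
            return y
    return ys[-1]


def expected_shortfall(sample, alpha):
    """(1 / alpha) * integral of VaR_u over (0, alpha], for alpha in (0, 1]."""
    ys = sorted(sample)
    n = len(ys)
    total, covered = 0.0, 0.0
    for y in ys:
        # VaR_u equals y on an interval of length 1/n; clip it at alpha.
        weight = min(1.0 / n, alpha - covered)
        if weight <= 0.0:
            break
        total += weight * y
        covered += weight
    return total / alpha


def spectral_risk_measure(sample, weights, levels):
    """nu_mu(F_n) for a spectral measure mu = sum_m p_m * delta_{q_m} with finite support."""
    return sum(p * expected_shortfall(sample, q) for p, q in zip(weights, levels))


if __name__ == "__main__":
    sample = [-3.0, -1.0, 0.0, 1.0, 2.0, 4.0, 5.0, 7.0, 8.0, 10.0]
    print("VaR_0.25:", value_at_risk(sample, 0.25))
    print("ES_0.25 :", expected_shortfall(sample, 0.25))
    print("nu_mu   :", spectral_risk_measure(sample, (0.5, 0.5), (0.25, 0.5)))
```

With $\alpha = 1$ the Expected Shortfall collapses to the sample mean, matching the statement that $\nu_{\delta_1}$ is the expectation, and $\mathrm{ES}_\alpha \le \mathrm{VaR}_\alpha$ holds by construction since only the lowest order statistics are averaged.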

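Theorem 5.2. Let $\mathcal{F}$ be a class of distribution functions on $\mathbb{R}$ with finite first moments. Let $\nu_\mu\colon \mathcal{F} \to \mathbb{R}$ be a spectral risk measure where $\mu$ is given by

\[
\mu = \sum_{m=1}^{k-1} p_m\, \delta_{q_m}
\]

with $p_m \in (0,1]$, $\sum_{m=1}^{k-1} p_m = 1$, $q_m \in (0,1)$ and the $q_m$'s pairwise distinct. Define the functional $T = (T_1, \ldots, T_k)\colon \mathcal{F} \to \mathbb{R}^k$, where $T_m(F) := F^{-1}(q_m)$, $m \in \{1, \ldots, k-1\}$, and $T_k(F) := \nu_\mu(F)$. Then the following assertions are true:

(i) If the distributions in $\mathcal{F}$ have unique $q_m$-quantiles, $m \in \{1, \ldots, k-1\}$, then the functional $T$ is $k$-elicitable with respect to $\mathcal{F}$.

(ii) Let $\mathsf{A} \supseteq T(\mathcal{F})$ be convex and set $\mathsf{A}'_r := \{x_r \mid \exists\, (z_1, \ldots, z_k) \in \mathsf{A},\ x_r = z_r\}$, $r \in \{1, \ldots, k\}$. Define the scoring function $S\colon \mathsf{A} \times \mathbb{R} \to \mathbb{R}$ by

\begin{align}
S(x,y) = {} & \sum_{r=1}^{k-1} \bigl(\mathbf{1}\{y \le x_r\} - q_r\bigr) G_r(x_r) - \mathbf{1}\{y \le x_r\}\, G_r(y) \tag{5.2}\\
& + G_k(x_k) \Bigl( x_k + \sum_{m=1}^{k-1} \frac{p_m}{q_m} \bigl( \mathbf{1}\{y \le x_m\}(x_m - y) - q_m x_m \bigr) \Bigr) \notag\\
& - \mathcal{G}_k(x_k) + a(y), \notag
\end{align}

where $a\colon \mathbb{R} \to \mathbb{R}$ is $\mathcal{F}$-integrable, $G_r\colon \mathsf{A}'_r \to \mathbb{R}$, $r \in \{1, \ldots, k\}$, $\mathcal{G}_k\colon \mathsf{A}'_k \to \mathbb{R}$ with $\mathcal{G}'_k = G_k$, and for all $r \in \{1, \ldots, k\}$ and all $x_r \in \mathsf{A}'_r$ the functions $\mathbf{1}_{(-\infty, x_r]} G_r$ are $\mathcal{F}$-integrable.

If $\mathcal{G}_k$ is convex and for all $r \in \{1, \ldots, k-1\}$ and $x_k \in \mathsf{A}'_k$ the function

\[
\mathsf{A}'_{r,x_k} \to \mathbb{R}, \qquad x_r \mapsto x_r\, \frac{p_r}{q_r}\, G_k(x_k) + G_r(x_r) \tag{5.3}
\]

with $\mathsf{A}'_{r,x_k} := \{x_r : \exists\, (z_1, \ldots, z_k) \in \mathsf{A},\ x_r = z_r,\ x_k = z_k\}$ is increasing, then $S$ is $\mathcal{F}$-consistent for $T$. If additionally the distributions in $\mathcal{F}$ have unique $q_m$-quantiles, $m \in \{1, \ldots, k-1\}$, $\mathcal{G}_k$ is strictly convex and the functions given at (5.3) are strictly increasing, then $S$ is strictly $\mathcal{F}$-consistent for $T$.

(iii) Assume that the elements of $\mathcal{F}$ have unique $q_m$-quantiles, $m \in \{1, \ldots, k-1\}$, and continuous densities. Define the function $V\colon \mathsf{A} \times \mathbb{R} \to \mathbb{R}^k$ with components

\begin{align}
V_m(x_1, \ldots, x_k, y) &= \mathbf{1}\{y \le x_m\} - q_m, \qquad m \in \{1, \ldots, k-1\}, \notag\\
V_k(x_1, \ldots, x_k, y) &= x_k - \sum_{m=1}^{k-1} \frac{p_m}{q_m}\, y\, \mathbf{1}\{y \le x_m\}. \tag{5.4}
\end{align}

Then $V$ is a strict $\mathcal{F}$-identification function for $T$ satisfying assumption (V3).

If additionally $\mathcal{F}$ is convex, the interior of $\mathsf{A} := T(\mathcal{F}) \subseteq \mathbb{R}^k$ is a star domain, (V1) and (F1) hold, and $(V_1, \ldots, V_{k-1})$ satisfies (V4), then every strictly $\mathcal{F}$-consistent scoring function $S\colon \mathsf{A} \times \mathbb{R} \to \mathbb{R}$ for $T$ satisfying (S2) and (VS1) is necessarily of the form given at (5.2) almost everywhere. Additionally, $\mathcal{G}_k$ must be strictly convex and the functions at (5.3) must be strictly increasing.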


Remark 5.3. According to Theorem 5.2, the pair $(\mathrm{VaR}_\alpha(F), \mathrm{ES}_\alpha(F))$, and more generally $(F^{-1}(q_1), \ldots, F^{-1}(q_{k-1}), \nu_\mu(F))$, admits only non-separable strictly consistent scoring functions. This result gives an example demonstrating that Osband (1985, Proposition 2.3) cannot be correct, as it states that any strictly consistent scoring function for a functional with a quantile as a component must be separable in the sense that it must be the sum of a strictly consistent scoring function for the quantile and a strictly consistent scoring function for the rest of the functional.


Using Theorem 5.2 and the revelation principle (Proposition 2.13), we can now state one of the main results of this paper.


Corollary 5.4. Let $\mathcal{F}$ be a class of distribution functions on $\mathbb{R}$ with finite first moments and unique quantiles. Let $\nu_\mu\colon \mathcal{F} \to \mathbb{R}$ be a spectral risk measure. If the support of $\mu$ is finite with $L$ elements and contained in $(0,1]$, then $\nu_\mu$ is a component of a $k$-elicitable functional where

(i) $k = L$, if $\mu$ is concentrated at 1, meaning $\mu(\{1\}) = 1$;

(ii) $k = 1 + L$, if $\mu(\{1\}) < 1$.


In the special case of $T = (\mathrm{VaR}_\alpha, \mathrm{ES}_\alpha)$, the maximal sensible action domain is $\mathsf{A}_0 = \{x \in \mathbb{R}^2 : x_1 \ge x_2\}$ as we always have $\mathrm{ES}_\alpha(F) \le \mathrm{VaR}_\alpha(F)$. For this action domain, the characterization of consistent scoring functions of Theorem 5.2 simplifies as follows.

Corollary 5.5. Let $\alpha \in (0,1)$. Let $\mathcal{F}$ be a class of distribution functions on $\mathbb{R}$ with finite first moments and unique $\alpha$-quantiles. Let $\mathsf{A}_0 = \{x \in \mathbb{R}^2 : x_1 \ge x_2\}$. A scoring function $S\colon \mathsf{A}_0 \times \mathbb{R} \to \mathbb{R}$ of the form

\begin{align}
S(x_1, x_2, y) = {} & \bigl(\mathbf{1}\{y \le x_1\} - \alpha\bigr) G_1(x_1) - \mathbf{1}\{y \le x_1\}\, G_1(y) \tag{5.5}\\
& + G_2(x_2) \Bigl( x_2 - x_1 + \frac{1}{\alpha}\, \mathbf{1}\{y \le x_1\}(x_1 - y) \Bigr) - \mathcal{G}_2(x_2) + a(y), \notag
\end{align}

where $G_1, G_2, \mathcal{G}_2, a\colon \mathbb{R} \to \mathbb{R}$, $\mathcal{G}'_2 = G_2$, $a$ is $\mathcal{F}$-integrable and $\mathbf{1}_{(-\infty, x_1]} G_1$ is $\mathcal{F}$-integrable for all $x_1 \in \mathbb{R}$, is $\mathcal{F}$-consistent for $T$ if $G_1$ is increasing and $\mathcal{G}_2$ is increasing and convex. If $\mathcal{G}_2$ is strictly increasing and strictly convex, then $S$ is strictly $\mathcal{F}$-consistent for $T$.

Under the conditions of Theorem 5.2 (iii), all strictly $\mathcal{F}$-consistent scoring functions for $T$ are of the form (5.5) almost everywhere.
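To see a score of the form (5.5) at work, the sketch below makes the hypothetical choices $G_1 \equiv 0$, $\mathcal{G}_2 = \exp$ (hence $G_2 = \exp$) and $a \equiv 0$, and uses a ten-point sample whose empirical $0.25$-quantile is unique. It then searches a grid inside $\mathsf{A}_0$ for the minimizer of the average score. The sample, grid and function names are assumptions made for illustration.

```python
import math


def joint_score(x1, x2, y, alpha):
    """Score (5.5) with G_1 = 0, calligraphic G_2 = exp (so G_2 = exp) and a = 0."""
    ind = 1.0 if y <= x1 else 0.0
    return math.exp(x2) * (x2 - x1 + ind * (x1 - y) / alpha) - math.exp(x2)


def mean_score(x1, x2, sample, alpha):
    """Average realized score of the forecast pair (x1, x2) over the sample."""
    return sum(joint_score(x1, x2, y, alpha) for y in sample) / len(sample)


sample = [-3.0, -1.0, 0.0, 1.0, 2.0, 4.0, 5.0, 7.0, 8.0, 10.0]
alpha = 0.25

# Grid with step 0.1 inside the action domain A_0 = {x1 >= x2}.
grid = [
    (i / 10.0, j / 10.0)
    for i in range(-30, 31)
    for j in range(-40, 11)
    if i >= j
]
best = min(grid, key=lambda x: mean_score(x[0], x[1], sample, alpha))

if __name__ == "__main__":
    # For this sample the empirical VaR_0.25 is 0 and ES_0.25 is -1.6.
    print("grid minimizer (x1, x2):", best)
```

The first coordinate enters the score only through a convex piecewise linear function whose minimum over $x_1$ is attained at the quantile and equals $-\mathrm{ES}_\alpha$; this coupling is what lets the second coordinate recover Expected Shortfall, and it is also why the score cannot be split into a quantile part plus a remainder.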


Acerbi and Székely (2014) also give an example of a scoring function for the pair $T = (\mathrm{VaR}_\alpha, \mathrm{ES}_\alpha)\colon \mathcal{F} \to \mathsf{A} \subseteq \mathbb{R}^2$. They use a different sign convention for $\mathrm{VaR}_\alpha$ and $\mathrm{ES}_\alpha$ than we do in this paper. Using our sign convention, their proposed scoring function $S^W\colon \mathsf{A} \times \mathbb{R} \to \mathbb{R}$ reads

\begin{align}
S^W(x_1, x_2, y) = {} & \alpha \bigl( x_2^2/2 + W x_1^2/2 - x_1 x_2 \bigr) \tag{5.6}\\
& + \mathbf{1}\{y \le x_1\} \bigl( -x_2 (y - x_1) + W (y^2 - x_1^2)/2 \bigr), \notag
\end{align}

where $W \in \mathbb{R}$. The authors claim that $S^W$ is a strictly $\mathcal{F}$-consistent scoring function for $T = (\mathrm{VaR}_\alpha, \mathrm{ES}_\alpha)$ provided that

\[
\mathrm{ES}_\alpha(F) > W\, \mathrm{VaR}_\alpha(F) \tag{5.7}
\]

for all $F \in \mathcal{F}$. This means that they consider a strictly smaller action domain than $\mathsf{A}_0$ in Corollary 5.5. They assume that the distributions in $\mathcal{F}$ have continuous densities, unique $\alpha$-quantiles, and that $F(x) \in (0,1)$ implies $f(x) > 0$ for all $F \in \mathcal{F}$ with density $f$. Furthermore, in order to ensure that $S^W(\cdot, F)$ is finite, one needs to impose the assumption that $\int_{-\infty}^{x} y^2\, dF(y)$ is finite for all $x \in \mathbb{R}$ and $F \in \mathcal{F}$. This is slightly less than requiring finite second moments. As a matter of fact, they only show that $\nabla S^W(t_1, t_2, F) = 0$ for $F \in \mathcal{F}$ and $(t_1, t_2) = T(F)$ and that $\nabla^2 S^W(t_1, t_2, F)$ is positive definite. This only shows that $S^W(x, F)$ has a local minimum at $x = T(F)$ but does not provide a proof concerning a global minimum; see also the discussion after Corollary 3.4. However, we can use Theorem 5.2 (ii) to verify their claims with $G_1(x_1) = -(W/2) x_1^2$, $\mathcal{G}_2(x_2) = (\alpha/2) x_2^2$ and $a = 0$. Hence, $\mathcal{G}_2$ is strictly convex, and the function $x_1 \mapsto x_1 G_2(x_2)/\alpha + G_1(x_1)$ is strictly increasing in $x_1$ if and only if $x_2 > W x_1$, as at (5.7).
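The identification of $S^W$ with the general form (5.5) is a short algebraic identity, which can also be checked numerically. The sketch below assumes the choices stated in the text, $G_1(x_1) = -(W/2)x_1^2$, $\mathcal{G}_2(x_2) = (\alpha/2)x_2^2$ (so $G_2(x_2) = \alpha x_2$) and $a = 0$; the function names and the random test points are illustrative.

```python
import random


def indicator(y, x):
    """1{y <= x}."""
    return 1.0 if y <= x else 0.0


def score_general(x1, x2, y, alpha, G1, G2, calG2):
    """Scoring function (5.5) with a = 0."""
    ind = indicator(y, x1)
    return (
        (ind - alpha) * G1(x1)
        - ind * G1(y)
        + G2(x2) * (x2 - x1 + ind * (x1 - y) / alpha)
        - calG2(x2)
    )


def score_acerbi_szekely(x1, x2, y, alpha, W):
    """Scoring function (5.6) in the sign convention of this paper."""
    ind = indicator(y, x1)
    return (
        alpha * (x2 ** 2 / 2.0 + W * x1 ** 2 / 2.0 - x1 * x2)
        + ind * (-x2 * (y - x1) + W * (y ** 2 - x1 ** 2) / 2.0)
    )


def max_discrepancy(alpha, W, n_points=1000, seed=1):
    """Largest absolute difference between (5.5) and (5.6) on random points."""
    rng = random.Random(seed)
    G1 = lambda x: -(W / 2.0) * x * x
    G2 = lambda x: alpha * x
    calG2 = lambda x: (alpha / 2.0) * x * x
    worst = 0.0
    for _ in range(n_points):
        x1, x2, y = (rng.uniform(-5.0, 5.0) for _ in range(3))
        diff = score_general(x1, x2, y, alpha, G1, G2, calG2) - score_acerbi_szekely(
            x1, x2, y, alpha, W
        )
        worst = max(worst, abs(diff))
    return worst


if __name__ == "__main__":
    print("max |(5.5) - (5.6)|:", max_discrepancy(alpha=0.05, W=2.0))
```

The discrepancy is at the level of floating-point rounding, so (5.6) is indeed a special case of (5.5); the monotonicity requirement on $x_1 \mapsto x_1 x_2 - (W/2)x_1^2$ then gives $x_2 > W x_1$ directly.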

The scoring function $S^W$ has one property which is potentially relevant in applications. If $x_1$, $x_2$ and $y$ are expressed in the same units of measurement, then $S^W(x_1, x_2, y)$ is a quantity with these units squared. If one insists that we should only add quantities with the same units, then the necessary condition that $x_1 \mapsto x_1 G_2(x_2)/\alpha + G_1(x_1)$ is strictly increasing enforces a condition of the type (5.7). The action domain is restricted for $S^W$ and the choice of $W$ may not be obvious in practice. Similarly, for the maximal action domain $\mathsf{A}_0$, an open question of practical interest is the choice of the functions $G_1$ and $\mathcal{G}_2$ in (5.5). We would like to remark that $S$ remains strictly consistent upon choosing $G_1 = 0$ and $\mathcal{G}_2$ strictly increasing and strictly convex.


6 Discussion 

We have investigated necessary and sufficient conditions for the elicitability of $k$-dimensional functionals of $d$-dimensional distributions. In order to derive necessary conditions we have adapted Osband's principle for the case where the class $\mathcal{F}$ of distributions does not necessarily contain distributions with finite support. This comes at the cost of certain smoothness assumptions on the expected scores $S(\cdot,F)$. For particular situations,



tains all distributions with finite support, which is not necessary for the validity of our result. While this is not a great gain in the case of linear functionals or ratios of expectations, it comes in handy when considering spectral risk measures. Value at Risk, $\mathrm{VaR}_\alpha$, being defined as the smallest $\alpha$-quantile, is generally not elicitable for distributions where the $\alpha$-quantile is not unique. Therefore, we believe that it is also not possible to show joint elicitability of $(\mathrm{VaR}_\alpha, \mathrm{ES}_\alpha)$ for classes $\mathcal{F}$ of distributions with non-unique $\alpha$-quantiles. However, we can give at least consistent scoring functions which become strictly consistent as soon as the elements of $\mathcal{F}$ have unique quantiles. Fortunately, the classes $\mathcal{F}$ of distributions that are relevant in risk management usually consist of absolutely continuous distributions having unique quantiles.


Emmer et al. (2013) have remarked that $\mathrm{ES}_\alpha$ is conditionally elicitable. One can slightly generalize their definition of conditional elicitability as follows.


Definition 6.1. Fix an integer $k \ge 1$. A functional $T_k\colon \mathcal{F} \to A_k \subseteq \mathbb{R}$ is called conditionally
elicitable of order $k$ if there are $k-1$ elicitable functionals $T_m\colon \mathcal{F} \to A_m \subseteq \mathbb{R}$,
$m \in \{1,\dots,k-1\}$, such that $T_k$ is elicitable restricted to the class

$$\mathcal{F}_{x_1,\dots,x_{k-1}} := \{F \in \mathcal{F} : T_1(F) = x_1, \dots, T_{k-1}(F) = x_{k-1}\}$$

for any $(x_1,\dots,x_{k-1}) \in A_1 \times \cdots \times A_{k-1}$.


Mutatis mutandis, one can define a notion of conditional identifiability by replacing the
term 'elicitable' with 'identifiable' in the above definition. It is not difficult to check that
any conditionally identifiable functional $T_k$ of order $k$ is a component of an identifiable
functional $T = (T_1,\dots,T_k)$. Spectral risk measures $\nu_\mu$ with spectral measure $\mu$ with finite
support in $(0,1)$ provide an example of a conditionally elicitable functional of order $L+1$,
where $L$ is the cardinality of the support of $\mu$; see Theorem 5.2. However, we would like
to stress that it is generally an open question whether any conditionally elicitable and
identifiable functional $T_k$ of order $k \ge 2$ is always a component of a $k$-elicitable functional.


Slightly modifying Lambert et al. (2008, Definition 11), one could define the elicitability
order of a real-valued functional $T$ as the smallest number $k$ such that the functional
is a component of a $k$-elicitable functional. It is clear that the elicitability order of the
variance is two, and we have shown that the same is true for $\mathrm{ES}_\alpha$ for reasonably large
classes $\mathcal{F}$. For spectral risk measures the elicitability order is at most $L+1$, where $L$
is the cardinality of the support; see Theorem 5.2.
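
The statement about the variance can be made concrete with a short sketch. It relies on the well-known fact that the pair (mean, second moment) is elicitable, for instance with a sum of squared-error scores, and that the variance is a function of this pair. The particular score and the function names below are illustrative assumptions, not taken from the paper.

```python
from statistics import fmean, pvariance


def moment_score(x1, x2, y):
    """Sum of squared errors for the mean (x1) and the second moment (x2)."""
    return (x1 - y) ** 2 + (x2 - y * y) ** 2


def mean_score(x1, x2, ys):
    """Average realized score of the forecast (x1, x2) over observations ys."""
    return fmean([moment_score(x1, x2, y) for y in ys])


def variance_from_pair(x1, x2):
    """Recover the variance from the pair (mean, second moment)."""
    return x2 - x1 ** 2


ys = [0.5, 1.0, 2.0, 4.0, 7.5]
m1 = fmean(ys)                    # minimizer of the first squared error
m2 = fmean([y * y for y in ys])   # minimizer of the second squared error
```

Here `variance_from_pair(m1, m2)` agrees with the population variance of the sample, so the variance is obtained as a component of a two-dimensional elicitable functional after a bijective reparametrization.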


In the one-dimensional case, Steinwart et al. (2014) have shown that having convex
level sets in the sense of Proposition 2.14 is a sufficient condition for elicitability of a
functional $T$ under continuity assumptions on $T$. Without such continuity assumptions,
the converse of Proposition 2.14 is generally false; see Heinrich (2013) for the example
of the mode functional. It is an open (and potentially difficult) question under which
conditions a converse of Proposition 2.14 is true for higher order elicitability.


7 Proofs 

Proof of Lemma 2.9

The first part is a direct consequence of the definition of strict $\mathcal{F}$-consistency. For the
second part, we use part (i) and consider $\psi\colon D \to \mathbb{R}$, $s \mapsto S(t+sv,F)$ for $t = T(F) \in \operatorname{int}(A)$,
$v \in \mathbb{S}^{k-1}$ and $D = \{s \in \mathbb{R} : t + sv \in \operatorname{int}(A)\}$. The strict orientation of $\nabla S$ implies
that $\psi'(s) = v^\top \nabla S(t+sv,F) = 0$ if $s = 0$, $\psi'(s) > 0$ for $s > 0$ and $\psi'(s) < 0$ for $s < 0$. □


Proof of Theorem 3.2

Let $x \in \operatorname{int}(A)$. The identifiability property of $V$ plus the first order condition stemming
from the strict $\mathcal{F}$-consistency of $S$ yields the relation $V(x,F) = 0 \implies \nabla S(x,F) = 0$ for
all $F \in \mathcal{F}$. Let $l \in \{1,\dots,k\}$. To show (3.2), consider the composed functional

$$B(x,\cdot)\colon \mathcal{F} \to \mathbb{R}^{k+1}, \quad F \mapsto (\partial_l S(x,F), V(x,F)).$$

By construction, we know that

$$V(x,F) = 0 \iff B(x,F) = 0 \tag{7.1}$$

for all $F \in \mathcal{F}$. Assumption (V1) implies that there are $F_1,\dots,F_{k+1} \in \mathcal{F}$ such that the
matrix

$$\mathbb{V} = \operatorname{mat}(V(x,F_1),\dots,V(x,F_{k+1})) \in \mathbb{R}^{k\times(k+1)}$$

has maximal rank, meaning $\operatorname{rank}(\mathbb{V}) = k$. Indeed, if $\operatorname{rank}(\mathbb{V}) < k$, then $\operatorname{span}\{V(x,F_1),\dots,V(x,F_{k+1})\}$
would be a proper linear subspace of $\mathbb{R}^k$, so that the interior of $\operatorname{conv}(\{V(x,F_1),\dots,V(x,F_{k+1})\})$ would
be empty. Let $G \in \mathcal{F}$. Then still $0 \in \operatorname{int}(\operatorname{conv}(\{V(x,G),V(x,F_1),\dots,V(x,F_{k+1})\}))$, such
that $\operatorname{rank}(\mathbb{V}_G) = k$ where

$$\mathbb{V}_G = \operatorname{mat}(V(x,G),V(x,F_1),\dots,V(x,F_{k+1})) \in \mathbb{R}^{k\times(k+2)}.$$


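Define the matrix

$$\mathbb{B}_G = \begin{pmatrix} \partial_l S(x,G) \;\; \partial_l S(x,F_1) \;\; \cdots \;\; \partial_l S(x,F_{k+1}) \\ \mathbb{V}_G \end{pmatrix},$$

that is, the first row collects the partial derivatives of the expected scores and the remaining $k$ rows are those of $\mathbb{V}_G$.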

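We use (7.1) to show that $\ker(\mathbb{B}_G) = \ker(\mathbb{V}_G)$. First observe that the relation $\ker(\mathbb{B}_G) \subseteq \ker(\mathbb{V}_G)$
is clear by construction. To show the other inclusion, let $\theta \in \ker(\mathbb{V}_G)$ be an
element of the simplex. Then (7.1) and the convexity of $\mathcal{F}$ yield that $\theta \in \ker(\mathbb{B}_G)$: the
mixture of $G, F_1, \dots, F_{k+1}$ with weights $\theta$ belongs to $\mathcal{F}$, and both $V(x,\cdot)$ and $B(x,\cdot)$ are
linear in the distribution. By linearity, the inclusion also holds for all $\theta \in \ker(\mathbb{V}_G)$ with
nonnegative components. Finally, let $\theta \in \ker(\mathbb{V}_G)$ be arbitrary. Assumption (V1) implies that
there is $\theta^* \in \ker(\mathbb{V}_G)$ with strictly positive components. Hence, there is an $\varepsilon > 0$ such that
$\theta^* + \varepsilon\theta$ has nonnegative components. Since $\mathbb{V}_G(\theta^* + \varepsilon\theta) = \mathbb{V}_G\theta^* + \varepsilon\mathbb{V}_G\theta = 0$, we know
that $\theta^* + \varepsilon\theta \in \ker(\mathbb{B}_G)$. Again using linearity and the fact that $\theta^* \in \ker(\mathbb{B}_G)$ we obtain
that $\theta \in \ker(\mathbb{B}_G)$.

With the rank-nullity theorem, this gives $\operatorname{rank}(\mathbb{B}_G) = \operatorname{rank}(\mathbb{V}_G) = k$. Hence, there is
a unique vector $(h_{l1}(x),\dots,h_{lk}(x)) \in \mathbb{R}^k$ such that

$$\partial_l S(x,G) = \sum_{m=1}^{k} h_{lm}(x)\,V_m(x,G).$$

Since $G \in \mathcal{F}$ was arbitrary, the assertion at (3.2) follows.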

The second part of the claim can be seen as follows. For $x \in \operatorname{int}(A)$ pick $F_1,\dots,F_k \in \mathcal{F}$
such that $V(x,F_1),\dots,V(x,F_k)$ are linearly independent and let $\mathbb{V}(z)$ be the matrix with
columns $V(z,F_i)$, $i \in \{1,\dots,k\}$, for $z \in \operatorname{int}(A)$. Due to assumption (V2) or (V3), $\mathbb{V}(z)$
has full rank in some neighborhood $U$ of $x$. Let $r \in \{1,\dots,k\}$ and let $e_r$ be the $r$th
standard unit vector of $\mathbb{R}^k$. We define $\lambda(z) := \mathbb{V}(z)^{-1} e_r$ for $z \in U$. Taking the inverse of
a matrix is a continuously differentiable operation, so it is in particular locally Lipschitz
continuous. Therefore, the vector $\lambda$ inherits the regularity properties of $V(z,F_i)$, that is,
under (V2) $\lambda$ is continuous, and under (V3) $\lambda$ is locally Lipschitz continuous. Therefore,
these properties carry over to $h$ because for $l \in \{1,\dots,k\}$, $z \in U$

$$h_{lr}(z) = \sum_{i=1}^{k} \lambda_i(z) \sum_{m=1}^{k} h_{lm}(z)\,V_m(z,F_i) = \sum_{i=1}^{k} \lambda_i(z)\,\partial_l S(z,F_i)$$

using the assumptions on $S$. □
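
The identity (3.2) can be checked numerically in the simplest case $k = 1$. The sketch below is an illustration under stated assumptions: the functional is the mean with identification function $V(x,y) = x - y$, the score is the Bregman-type score generated by the exponential function, and the claimed link function is $h(x) = e^x$. The choice of score, the discrete distribution and all names are hypothetical.

```python
import math


def bregman_exp_score(x, y):
    """Bregman-type score for the mean generated by phi = exp (illustrative choice)."""
    return math.exp(y) - math.exp(x) - math.exp(x) * (y - x)


def identification(x, y):
    """Identification function of the mean."""
    return x - y


def h(x):
    """Candidate link function between score gradient and identification function."""
    return math.exp(x)


def expected(fn, x, support, probs):
    """Expectation of fn(x, Y) under a finitely supported distribution."""
    return sum(p * fn(x, y) for y, p in zip(support, probs))


def score_derivative(x, support, probs, eps=1e-6):
    """Central finite-difference approximation of d/dx of the expected score."""
    up = expected(bregman_exp_score, x + eps, support, probs)
    down = expected(bregman_exp_score, x - eps, support, probs)
    return (up - down) / (2.0 * eps)


support, probs = [-1.0, 0.5, 2.0], [0.2, 0.5, 0.3]
```

For any forecast `x`, `score_derivative(x, support, probs)` should agree, up to discretization error, with `h(x) * expected(identification, x, support, probs)`.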


Proof of Proposition 3.5

Let $x \in \operatorname{int}(A)$, $F \in \mathcal{F}$ and let $z \in \operatorname{int}(A)$ be some star point. Using a telescoping argument
we obtain

$$\begin{aligned}
S(x,F) - S(z,F) &= S(x_1,\dots,x_k,F) - S(x_1,\dots,x_{k-1},z_k,F) \\
&\quad + S(x_1,\dots,x_{k-1},z_k,F) - S(x_1,\dots,x_{k-2},z_{k-1},z_k,F) \\
&\quad + \cdots \\
&\quad + S(x_1,z_2,\dots,z_k,F) - S(z_1,\dots,z_k,F) \\
&= \sum_{r=1}^{k} \int_{z_r}^{x_r} \partial_r S(x_1,\dots,x_{r-1},v,z_{r+1},\dots,z_k,F)\,dv.
\end{aligned}$$

Invoking the identity at (3.2) yields (3.4) for the expected scores with $a(F) = S(z,F)$. We
denote the right hand side of (3.4) minus $a(y)$ by $I(x,y)$, hence $I(x,F) = S(x,F) - S(z,F)$.

For almost all $y \in \mathsf{O}$, the set $\{x \in \mathbb{R}^k \mid (x,y) \in C^c\} =: A_y$ has $k$-dimensional Lebesgue
measure zero, where $C^c$ is the complement of the set $C$ defined in assumption (VS1).
Let $y \in \mathsf{O}$ be such that $A_y$ has measure zero. Then we obtain that for almost all $x$ the
sets $\{x_i \in \mathbb{R} \mid (x,y) \in A_y\} =: N_i$ have one-dimensional Lebesgue measure zero for all
$i \in \{1,\dots,k\}$. Therefore, $S(x,\cdot)$ and $I(x,\cdot)$ are continuous in $y$ for almost all $x$.

Let $(F_n)_{n\in\mathbb{N}}$ be a sequence as in assumption (F1), that is, $(F_n)_{n\in\mathbb{N}}$ converges weakly to
$\delta_y$ and the support of all $F_n$ is contained in some compact set $K$. Let $\varphi$ be a function on
$\mathsf{O}$ which is locally bounded and continuous at $y$. By the dominated convergence theorem
and the continuous mapping theorem we get that then $\int_{\mathsf{O}} \varphi\,dF_n \to \varphi(y)$.

By this argument (recalling that $S(x,\cdot)$, $V(x,\cdot)$ are assumed to be locally bounded),
if $S(x,\cdot)$ and $I(x,\cdot)$ are continuous at $y$, then $S(x,F_n) - I(x,F_n) \to S(x,y) - I(x,y)$. We
have shown that $S(x,F_n) - I(x,F_n)$ does not depend on $x$, hence the same is true for the
limit. Therefore, we can define $a(y) = S(x,y) - I(x,y)$ for almost all $y$. The function $a$ is
$\mathcal{F}$-integrable, since $S$ and $I$ are $\mathcal{F}$-integrable. □

Proof of Proposition 4.1

It is clear that $V$ given at (4.1) is a strict $\mathcal{F}$-identification function for $T$. Also the
orientation of $V$ follows directly from its form and the orientation of its components. We
have that $\partial_l V_r(x,F) = 0$ for all $l, r \in \{1,\dots,k\}$, $l \ne r$, and $x \in \operatorname{int}(A)$, $F \in \mathcal{F}$. Equation
(3.3) evaluated at $x = t = T(F)$ yields

$$h_{rl}(t)\,\partial_l V_l(t,F) = h_{lr}(t)\,\partial_r V_r(t,F). \tag{7.2}$$

If (V4) holds then (7.2) implies that $h_{rl}(t) = 0$ for $r \ne l$, hence we obtain (4.2) with the
surjectivity of $T$. On the other hand, if (V5) holds, (7.2) implies that $h_{rl}(t) = h_{lr}(t)$,
whence the second part of (4.3) is shown, again using the surjectivity of $T$. In both cases,
(3.3) is equivalent to

$$\sum_{m=1}^{k} \big(\partial_l h_{rm}(x) - \partial_r h_{lm}(x)\big)\,V_m(x,F) = 0. \tag{7.3}$$

Using assumption (V1) there are $F_1,\dots,F_k \in \mathcal{F}$ such that $V(x,F_1),\dots,V(x,F_k)$ are
linearly independent. This yields that $\partial_l h_{rm}(x) = \partial_r h_{lm}(x)$ for almost all $x \in \operatorname{int}(A)$. For
the first part of the Proposition, we can conclude that $\partial_l h_{rr}(x) = \partial_r h_{lr}(x) = 0$ for $r \ne l$
for almost all $x \in \operatorname{int}(A)$. Consequently, invoking that $A$ is connected, the functions $h_{mm}$
only depend on $x_m$ and we can write $h_{mm}(x) = g_m(x_m)$ for some function $g_m\colon A'_m \to \mathbb{R}$.
By Lemma 2.9 (i), for $v \in \mathbb{S}^{k-1}$, $t = T(F) \in \operatorname{int}(A)$, the function $s \mapsto S(t+sv,F)$ has a
global unique minimum at $s = 0$, hence

$$v^\top \nabla S(t+sv,F) = \sum_{m=1}^{k} g_m(t_m + sv_m)\,V_m(t_m + sv_m,F)\,v_m$$

vanishes for $s = 0$, is negative for $s < 0$ and positive for $s > 0$, where $s$ is in some
neighborhood of zero. Choosing $v$ as the $l$th standard basis vector of $\mathbb{R}^k$ we obtain that
$g_l > 0$ exploiting the orientation of $V_l$ and the surjectivity of $T$.

For the second part of the proposition, to show the assertion about the definiteness,
observe that due to assumption (V5), we have for $v \in \mathbb{S}^{k-1}$, $t = T(F) \in \operatorname{int}(A)$ that
$V_m(t+sv,F) = c_F s v_m$ where $c_F > 0$ due to assumption (V5) and the orientation of each
component of $V$. Hence, $v^\top \nabla S(t+sv,F) = c_F s\, v^\top h(t+sv)\, v$, which implies the claim
using again the surjectivity of $T$. □

Proof of Corollary 4.2


The sufficiency is immediate; see the proof of Lemma 2.15. For the necessity, we apply
Proposition 3.5 and Proposition 4.1 to obtain that there are positive functions $g_m$ and an
$\mathcal{F}$-integrable function $a$ such that

$$S(x,y) = \sum_{m=1}^{k} \int_{z_m}^{x_m} g_m(v)\,V_m(v,y)\,dv + a(y),$$

for almost all $(x,y) \in A \times \mathsf{O}$, where $z \in \operatorname{int}(A)$ is a star point of $\operatorname{int}(A)$. Let $t = T(F)$ and
$x_m \ne t_m$. The strict consistency of $S$ implies that $S(t,F) < S(t_1,\dots,t_{m-1},x_m,t_{m+1},\dots,t_k,F)$.
This means $S_m(t_m,F) < S_m(x_m,F)$ with $S_m(x_m,y) := \int_{z_m}^{x_m} g_m(v)\,V_m(v,y)\,dv + \frac{1}{k}a(y)$. □


Proof of Theorem 5.2

(i) The second part of Theorem 5.2 (ii) implies the $k$-elicitability of $T$.

(ii) Let $S\colon A \times \mathbb{R} \to \mathbb{R}$ be of the form (5.2), let $\mathcal{G}_k$ be convex and let the functions at
(5.3) be increasing. Let $F \in \mathcal{F}$, $x = (x_1,\dots,x_k) \in A$ and set $t = (t_1,\dots,t_k) = T(F)$,
$w = \min(x_k,t_k)$. Then, we obtain

$$\begin{aligned}
S(x,y) &= \sum_{r=1}^{k-1} \Big[ \big(\mathbb{1}\{y \le x_r\} - q_r\big)\Big(G_r(x_r) + \frac{p_r}{q_r}G_k(w)(x_r - y)\Big) - \mathbb{1}\{y \le x_r\}\,G_r(y) \Big] \\
&\quad + \big(G_k(x_k) - G_k(w)\big)\Big(x_k + \sum_{m=1}^{k-1} \frac{p_m}{q_m}\big(\mathbb{1}\{y \le x_m\}(x_m - y) - q_m x_m\big)\Big) \\
&\quad - \mathcal{G}_k(x_k) + G_k(w)(x_k - y) + a(y).
\end{aligned}$$

This implies that $S(x,F) - S(t,F) = R_1 + R_2$ with

$$R_1 = \sum_{r=1}^{k-1} \Big[ \big(F(x_r) - q_r\big)\Big(G_r(x_r) + \frac{p_r}{q_r}G_k(w)\,x_r\Big) - \int_{t_r}^{x_r} \Big(G_r(y) + \frac{p_r}{q_r}G_k(w)\,y\Big)\,dF(y) \Big],$$

$$R_2 = \big(G_k(x_k) - G_k(w)\big)\Big(x_k + \sum_{m=1}^{k-1} \frac{p_m}{q_m}\Big(\int_{-\infty}^{x_m} (x_m - y)\,dF(y) - q_m x_m\Big)\Big) - \mathcal{G}_k(x_k) + \mathcal{G}_k(t_k) + G_k(w)(x_k - t_k).$$

We denote the $r$th summand of $R_1$ by $\xi_r$ and suppose that $t_r < x_r$. Due to the assumptions,
the term $G_r(y) + \frac{p_r}{q_r}G_k(w)\,y$ is increasing in $y \in [t_r,x_r]$, which implies that

$$\xi_r \ge \big(F(x_r) - q_r\big)\Big(G_r(x_r) + \frac{p_r}{q_r}G_k(w)\,x_r\Big) - \big(F(x_r) - F(t_r)\big)\Big(G_r(x_r) + \frac{p_r}{q_r}G_k(w)\,x_r\Big) = 0.$$

Analogously, one can show that $\xi_r \ge 0$ if $x_r < t_r$. If $F$ has a unique $q_r$-quantile and the
term $G_r(y) + \frac{p_r}{q_r}G_k(w)\,y$ is strictly increasing in $y$, then we even get $\xi_r > 0$ if $x_r \ne t_r$.

Now consider the term $R_2$. Splitting the integrals from $-\infty$ to $x_m$ into integrals from
$-\infty$ to $t_m$ and from $t_m$ to $x_m$ and partially integrating the latter, we obtain

$$\begin{aligned}
R_2 &= \big(G_k(x_k) - G_k(w)\big)\Big(x_k + \sum_{m=1}^{k-1} p_m\Big(t_m - x_m - \frac{1}{q_m}\int_{-\infty}^{t_m} y\,dF(y) + \frac{1}{q_m}\int_{t_m}^{x_m} F(y)\,dy\Big)\Big) \\
&\quad - \mathcal{G}_k(x_k) + \mathcal{G}_k(t_k) + G_k(w)(x_k - t_k) \\
&= \big(G_k(x_k) - G_k(w)\big)\Big(x_k - t_k + \sum_{m=1}^{k-1} p_m\Big(t_m - x_m + \frac{1}{q_m}\int_{t_m}^{x_m} F(y)\,dy\Big)\Big) \\
&\quad - \mathcal{G}_k(x_k) + \mathcal{G}_k(t_k) + G_k(w)(x_k - t_k) \\
&\ge \big(G_k(x_k) - G_k(w)\big)(x_k - t_k) - \mathcal{G}_k(x_k) + \mathcal{G}_k(t_k) + G_k(w)(x_k - t_k) \\
&= \mathcal{G}_k(t_k) - \mathcal{G}_k(x_k) - G_k(x_k)(t_k - x_k) \ge 0.
\end{aligned}$$

The first inequality is due to the fact that (i) $G_k$ is increasing and (ii) for $x_m \ne t_m$ we
have $\frac{1}{q_m}\int_{t_m}^{x_m} F(y)\,dy \ge x_m - t_m$ with strict inequality if $F$ has a unique $q_m$-quantile. The
last inequality is due to the fact that $\mathcal{G}_k$ is convex. The inequality is strict if $x_k \ne t_k$ and
if $\mathcal{G}_k$ is strictly convex.

(iii) If $f$ denotes the density of $F$, it holds that

$$\mathrm{ES}_\alpha(F) = \frac{1}{\alpha}\int_{-\infty}^{F^{-1}(\alpha)} y f(y)\,dy, \quad \alpha \in (0,1]. \tag{7.4}$$
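
As a sanity check of (7.4), the following sketch evaluates the integral numerically for the standard normal distribution and compares it with the closed form $-\varphi(\Phi^{-1}(\alpha))/\alpha$, which follows from $\int_{-\infty}^{z} y\varphi(y)\,dy = -\varphi(z)$. The choice of distribution, the midpoint rule, the truncation point and the function names are assumptions of this illustration.

```python
from statistics import NormalDist


def es_lower_tail_numeric(alpha, dist, lower=-12.0, steps=50_000):
    """Midpoint-rule approximation of (1/alpha) * int_{-inf}^{F^{-1}(alpha)} y f(y) dy.

    The lower limit is truncated at `lower`, where the normal density is negligible.
    """
    upper = dist.inv_cdf(alpha)
    width = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        y = lower + (i + 0.5) * width
        total += y * dist.pdf(y)
    return total * width / alpha


def es_lower_tail_closed_form(alpha):
    """Closed form of the lower-tail Expected Shortfall of the standard normal."""
    std = NormalDist()
    return -std.pdf(std.inv_cdf(alpha)) / alpha


numeric = es_lower_tail_numeric(0.1, NormalDist())
closed = es_lower_tail_closed_form(0.1)
```

Both values agree up to the discretization error, and the Expected Shortfall lies below the corresponding quantile, as expected for a lower-tail average.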


We first show the assertions concerning $V$ given in the statement of the theorem. Let $F \in \mathcal{F}$ with density
$f = F'$ and let $t = T(F)$. Then we have for $m \in \{1,\dots,k-1\}$, $x \in A$, that $V_m(x,F) =
F(x_m) - q_m$ which is zero if and only if $x_m = t_m$. On the other hand, using the identity
at (7.4)

$$V_k(t_1,\dots,t_{k-1},x_k,F) = x_k - \sum_{m=1}^{k-1} \frac{p_m}{q_m}\int_{-\infty}^{t_m} y f(y)\,dy = x_k - t_k.$$

Hence, it follows that $V$ is a strict $\mathcal{F}$-identification function for $T$. Moreover, $V$ satisfies
assumption (V3), and we have for $m \in \{1,\dots,k-1\}$, $l \in \{1,\dots,k\}$ and $x \in \operatorname{int}(A)$ that
$\partial_l V_m(x,F) = 0$ if $l \ne m$ and $\partial_m V_m(x,F) = f(x_m)$, $\partial_m V_k(x,F) = -(p_m/q_m)\,x_m f(x_m)$ and
$\partial_k V_k(x,F) = 1$.

From now on, we assume that $t = T(F) \in \operatorname{int}(A)$. Let $S$ be a strictly $\mathcal{F}$-consistent
scoring function for $T$ satisfying (S2). Then we can apply Theorem 3.2 and Corollary 3.4
to get that there are locally Lipschitz continuous functions $h_{lm}\colon \operatorname{int}(A) \to \mathbb{R}$ such that
(3.2) and (3.3) hold. If we evaluate (3.3) for $l = k$, $m \in \{1,\dots,k-1\}$ at the point $x = t$
we get

$$h_{km}(t)\,\partial_m V_m(t,F) + h_{kk}(t)\,\partial_m V_k(t,F) = h_{mk}(t)\,\partial_k V_k(t,F),$$

which takes the form $h_{km}(t) f(t_m) - h_{kk}(t)\frac{p_m}{q_m} t_m f(t_m) = h_{mk}(t)$. Invoking assumption (V4)
for $(T_1,\dots,T_{k-1})$, we get that necessarily $h_{mk}(t) = 0$ and $h_{km}(t) = (p_m/q_m)\,t_m h_{kk}(t)$. So
with the surjectivity of $T$ we get for $x \in \operatorname{int}(A)$ that

$$h_{mk}(x) = 0, \qquad h_{km}(x) = \frac{p_m}{q_m}\,x_m h_{kk}(x) \quad \text{for all } m \in \{1,\dots,k-1\}. \tag{7.5}$$

Now, we can evaluate (3.3) for $m, l \in \{1,\dots,k-1\}$, $m \ne l$, at $x = t$ and use the first part
of (7.5) to get that $h_{ml}(t) f(t_l) = h_{lm}(t) f(t_m)$. Using again the same argument, we get for
$x \in \operatorname{int}(A)$ that

$$h_{ml}(x) = 0 \quad \text{for all } m, l \in \{1,\dots,k-1\},\ l \ne m. \tag{7.6}$$

At this stage, we can evaluate (3.3) for $l \in \{1,\dots,k-1\}$, $m \in \{1,\dots,k\}$, $m \ne l$, for some
$x \in \operatorname{int}(A)$. Using (7.5) and (7.6) we obtain

$$\sum_{i=1}^{k} \big(\partial_l h_{mi}(x) - \partial_m h_{li}(x)\big)\,V_i(x,F) = 0.$$

Invoking assumption (V1) and using (7.5) and (7.6), we can conclude that for almost all
$x \in A$,

$$\partial_l h_{mm}(x) = 0 \quad \text{for all } l \in \{1,\dots,k-1\},\ m \in \{1,\dots,k\},\ l \ne m, \tag{7.7}$$

and

$$\partial_k h_{ll}(x) = \frac{p_l}{q_l}\,h_{kk}(x) \quad \text{for all } l \in \{1,\dots,k-1\}. \tag{7.8}$$

Equation (7.7) for $m = k$ shows that there is a locally Lipschitz continuous function
$g_k\colon A'_k \to \mathbb{R}$ such that for all $(x_1,\dots,x_k) \in \operatorname{int}(A)$, we have $h_{kk}(x_1,\dots,x_k) = g_k(x_k)$.
Equation (7.8) together with (7.7) gives that for $l \in \{1,\dots,k-1\}$ and $(x_1,\dots,x_k) \in
\operatorname{int}(A)$, we obtain $h_{ll}(x_1,\dots,x_k) = (p_l/q_l)\,G_k(x_k) + g_l(x_l)$, where $g_l\colon A'_l \to \mathbb{R}$ is locally
Lipschitz continuous and $G_k\colon A'_k \to \mathbb{R}$ is such that $G'_k = g_k$.

Knowing the form of the matrix-valued function $h$, we can apply the second part of
Proposition 3.5. Let $z \in \operatorname{int}(A)$ be some star point. Then there is some $\mathcal{F}$-integrable
function $b\colon \mathbb{R} \to \mathbb{R}$ such that

$$\begin{aligned}
S(x,y) &= \sum_{r=1}^{k-1} \int_{z_r}^{x_r} \Big(\frac{p_r}{q_r}G_k(z_k) + g_r(v)\Big)\big(\mathbb{1}\{y \le v\} - q_r\big)\,dv \\
&\quad + \big(G_k(x_k) - G_k(z_k)\big) \sum_{m=1}^{k-1} \frac{p_m}{q_m}\Big(x_m\big(\mathbb{1}\{y \le x_m\} - q_m\big) - y\,\mathbb{1}\{y \le x_m\}\Big) \\
&\quad + G_k(x_k)\,x_k - \mathcal{G}_k(x_k) + b(y),
\end{aligned} \tag{7.9}$$

for almost all $(x,y)$, where $\mathcal{G}_k\colon A'_k \to \mathbb{R}$ is such that $\mathcal{G}'_k = G_k$. One can check by a
straightforward computation that the representation of $S$ at (7.9) is equivalent to the one
at (5.2) upon choosing a suitable $\mathcal{F}$-integrable function $a\colon \mathbb{R} \to \mathbb{R}$.

It remains to show that $\mathcal{G}_k$ is strictly convex and that the functions given at (5.3) are
strictly increasing. To this end, we use Lemma 2.9 part (i). Let $D = \{s \in \mathbb{R} : t + sv \in
\operatorname{int}(A)\}$, and let $v = (v_1,\dots,v_k) \in \mathbb{S}^{k-1}$ and without loss of generality assume $v_k \ge 0$. We
define $\psi\colon D \to \mathbb{R}$ by $\psi(s) := S(t+sv,F)$, that is,

$$\begin{aligned}
\psi(s) &= \sum_{r=1}^{k-1} \int_{z_r}^{\bar s_r} \Big(\frac{p_r}{q_r}G_k(z_k) + g_r(u)\Big)\big(F(u) - q_r\big)\,du \\
&\quad + \big(G_k(\bar s_k) - G_k(z_k)\big) \sum_{m=1}^{k-1} \frac{p_m}{q_m}\Big(\bar s_m\big(F(\bar s_m) - q_m\big) - \int_{-\infty}^{\bar s_m} y f(y)\,dy\Big) \\
&\quad + \bar s_k\,G_k(\bar s_k) - \mathcal{G}_k(\bar s_k) + b(F),
\end{aligned}$$

where we use the notation $\bar s = t + sv$. The function $\psi$ has a minimum at $s = 0$. Hence,
there is $\varepsilon > 0$ such that $\psi'(s) < 0$ for $s \in (-\varepsilon,0)$ and $\psi'(s) > 0$ for $s \in (0,\varepsilon)$. If $v_k = 0$,
then

$$\psi'(s) = \sum_{r=1}^{k-1} \big(F(\bar s_r) - q_r\big)\,v_r\Big(g_r(\bar s_r) + \frac{p_r}{q_r}G_k(\bar s_k)\Big).$$

Choosing $v$ as the $m$th standard basis vector of $\mathbb{R}^k$ for $m \in \{1,\dots,k-1\}$, we obtain that
$g_m(\bar s_m) + \frac{p_m}{q_m}G_k(\bar s_k) > 0$. Exploiting the surjectivity of $T$ we can deduce that the functions
at (5.3) are strictly increasing. On the other hand, if $v$ is the $k$th standard basis vector,
we obtain that $\psi'(s) = g_k(\bar s_k)\,s$. Again using the surjectivity of $T$ we get that $g_k > 0$, which
shows the strict convexity of $\mathcal{G}_k$. □


Proof of Corollary 5.4

For the first part of the claim, note that if $\mu(\{1\}) = 1$, then $\nu_\mu$ coincides with the
expectation and is thus 1-elicitable. If $\mu(\{1\}) = 0$, the assertion of the corollary is
a direct consequence of Theorem 5.2 (i). If $\lambda := \mu(\{1\}) \in (0,1)$, then we can write
$\mu = \sum_{m=1}^{k-2} p_m \delta_{q_m} + \lambda\delta_1$, where $p_m \in (0,1)$, $\sum_{m=1}^{k-2} p_m = 1 - \lambda$, $q_m \in (0,1)$ and the $q_m$'s are
pairwise distinct. Define the probability measure $\tilde\mu := \sum_{m=1}^{k-2} \frac{p_m}{1-\lambda}\delta_{q_m}$. Using Theorem 5.2
(i), the functional $(T'_1,\dots,T'_{k-1})\colon \mathcal{F} \to \mathbb{R}^{k-1}$ is $(k-1)$-elicitable where $T'_m(F) := F^{-1}(q_m)$,
$m \in \{1,\dots,k-2\}$, and $T'_{k-1}(F) = \nu_{\tilde\mu}(F)$. Using Lemma 2.15 we can deduce that the functional
$(T'_1,\dots,T'_{k-1},\nu_{\delta_1})\colon \mathcal{F} \to \mathbb{R}^k$ is $k$-elicitable. Note that $\nu_\mu = (1-\lambda)\nu_{\tilde\mu} + \lambda\nu_{\delta_1}$. Hence,
we can apply Proposition 2.13 to deduce that the functional $T = (T_1,\dots,T_k)\colon \mathcal{F} \to \mathbb{R}^k$ is
$k$-elicitable where $T_m = T'_m$, $m \in \{1,\dots,k-2\}$, $T_{k-1} = \nu_{\delta_1}$ and $T_k = \nu_\mu$. □

Proof of Corollary 5.5

The sufficiency follows directly from Theorem 5.2. We will show that $G_2$ is necessarily
bounded below. Suppose the contrary. For the action domain $A_0$, the first component $x_1$
ranges over $[x_2,\infty)$ for fixed $x_2$; therefore, for $x_2 < x_1 < x'_1$, (5.3) yields

$$-\infty < G_1(x_1) - G_1(x'_1) \le \frac{1}{\alpha}G_2(x_2)(x'_1 - x_1).$$

Letting $x_2 \to -\infty$ one obtains a contradiction. Let $C_2 = \lim_{x_2 \to -\infty} G_2(x_2) > -\infty$. Then,
by (5.3), we obtain that $G_1(x_1) + (C_2/\alpha)x_1$ is increasing in $x_1 \in \mathbb{R}$. We can write $S$ at
(5.5) as

$$\begin{aligned}
S(x_1,x_2,y) &= \big(\mathbb{1}\{y \le x_1\} - \alpha\big)\Big(G_1(x_1) + \frac{C_2}{\alpha}x_1\Big) - \mathbb{1}\{y \le x_1\}\Big(G_1(y) + \frac{C_2}{\alpha}y\Big) \\
&\quad + \big(G_2(x_2) - C_2\big)\Big(\frac{1}{\alpha}\mathbb{1}\{y \le x_1\}(x_1 - y) - (x_1 - x_2)\Big) \\
&\quad - \big(\mathcal{G}_2(x_2) - C_2 x_2\big) + a(y).
\end{aligned}$$

The last expression is again of the form at (5.5) with an increasing function $\tilde G_1(x_1) =
G_1(x_1) + (C_2/\alpha)x_1$ and with $\tilde G_2(x_2) = G_2(x_2) - C_2 > 0$. □
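
To illustrate how such a score ranks forecasts of the pair $(\mathrm{VaR}_\alpha, \mathrm{ES}_\alpha)$, the following sketch implements the displayed expression with $C_2 = 0$ and $a = 0$, choosing $G_1 = 0$ and $\mathcal{G}_2 = G_2 = \exp$ (strictly increasing and strictly convex). These choices, the standard normal test distribution, the deterministic quantile grid used as a sample, and all function names are assumptions of this illustration.

```python
import math
from statistics import NormalDist, fmean


def var_es_score(x1, x2, y, alpha,
                 G1=lambda x: 0.0, G2=math.exp, calG2=math.exp):
    """Score for (VaR_alpha, ES_alpha); calG2 is an antiderivative of G2."""
    ind = 1.0 if y <= x1 else 0.0
    return ((ind - alpha) * G1(x1) - ind * G1(y)
            + G2(x2) * (ind * (x1 - y) / alpha - (x1 - x2))
            - calG2(x2))


def mean_score(x1, x2, ys, alpha):
    """Average realized score of the forecast (x1, x2) over observations ys."""
    return fmean([var_es_score(x1, x2, y, alpha) for y in ys])


std = NormalDist()
alpha = 0.1
n = 20_000
# Deterministic stand-in for a large standard normal sample: a fine quantile grid.
ys = [std.inv_cdf((i + 0.5) / n) for i in range(n)]

var_true = std.inv_cdf(alpha)            # alpha-quantile of N(0, 1)
es_true = -std.pdf(var_true) / alpha     # lower-tail Expected Shortfall of N(0, 1)

best = mean_score(var_true, es_true, ys, alpha)
shifts = [(0.3, 0.0), (-0.3, 0.0), (0.0, 0.3), (0.0, -0.3)]
worse = [mean_score(var_true + d1, es_true + d2, ys, alpha) for d1, d2 in shifts]
```

All perturbed forecasts stay inside the action domain ($x_1 \ge x_2$) and receive a larger average score than the correct pair, in line with strict consistency.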